ABSTRACT

The Ginga data for the gamma-ray burst GRB870303 exhibit low-energy dips in two temporally
distinct spectra, denoted S1 and S2. S1, spanning 4 seconds, exhibits a single line candidate at
~ 20 keV, while S2, spanning 9 seconds, exhibits apparently harmonically spaced line candidates
at ~ 20 and 40 keV. The centers of the time intervals corresponding to S1 and S2 are separated by
22.5 seconds. We rigorously evaluate the statistical evidence for these lines, using
phenomenological continuum and line models which in their details are independent of the distance
scale to gamma-ray bursts. We employ the methodologies based on both frequentist and Bayesian
statistical inference that we develop in Freeman et al. (1999b). These methodologies utilize the
information present in the data to select the simplest model that adequately describes the data
from among a wide range of continuum and continuum-plus-line(s) models. This ensures that the
chosen model does not include free parameters that the data deem unnecessary and that would act to
reduce the frequentist significance and Bayesian odds of the continuum-plus-line(s) model. We
calculate the significance of the continuum-plus-line(s) models using the χ² Maximum Likelihood
Ratio test. We describe a parametrization of the exponentiated Gaussian absorption line shape that
makes the probability surface in parameter space better-behaved, allowing us to estimate
analytically the Bayesian odds. We find that the significance of the continuum-plus-line model
requested by the S1 data is 3.6 x 10^-[illegible], with the odds favoring it being 114:1. The
significance of the continuum-plus-lines model requested by the S2 data is 1.7 x 10^-[illegible],
with the odds favoring it being 7:1. We also apply our methodology to the combined (S1+S2) data.
The significance of the continuum-plus-lines model requested by the combined data is
4.2 x 10^[illegible], with the odds favoring it being 40,300:1.

1. Introduction

The cause of gamma-ray bursts (GRBs) remains a mystery, a quarter century after the
announcement of their discovery by Klebesadel, Strong, & Olson (1973). The recent discovery of
optical transients associated with GRBs (e.g., van Paradijs et al. 1997 and references therein),
and the apparent determination of redshifts for five of them (GRB970508, Metzger et al. 1997;
GRB971214, Kulkarni et al. 1998; GRB980613, Djorgovski et al. 1999; GRB980703, Djorgovski
et al. 1998; and GRB990123, Kelson et al. 1999), have indicated that some (if not all) GRBs occur
at cosmological distances. (While Ockham's Razor might lead one to conclude on the basis of the
available evidence that all bursts are cosmological, it is important to remember that the GRB sky
location data themselves do not yet rule out a separate galactic GRB source population; see, e.g.,
Loredo & Wasserman 1995, 1998a,b.) While broad-band observations indicate that relativistically
expanding fireballs can explain the spectral and temporal behavior of these cosmological transients
(Goodman 1986; Mészáros & Rees 1997), it is the study of low-energy (≲ 100 keV) spectral line
candidates seen in the spectra of other GRBs that can potentially provide the most powerful means
both to determine how cosmological and/or galactic GRBs occur and to place constraints on their
environments.

Mazets et al. (1980, 1981) were the first to report low-energy spectral line candidates. They
found single dips and troughs in the spectra of 19 bursts detected by the Konus detectors on Venera
11 and Venera 12.* This corresponds to ≈ 15% of the bursts detected by Konus. The statistical
significances of these features have not been reported. Hueter (1987) then reported single low-energy
dips with modest statistical significance (~ 10^[illegible]) in spectra of two bursts out of 21
detected by the HEAO-1 A4 detector. These reports influenced the design of the Los Alamos/ISAS
Gamma-Ray Burst Detector (GBD; Murakami et al. 1989) on the Ginga satellite. To help analysts
differentiate spectral lines from changes in continuum shape, a proportional counter (PC) covering
the energy range ~ 1.5-30 keV was included as part of the GBD, in addition to a scintillator counter
(SC) covering ≈ 15-400 keV. (For Konus and HEAO-1 A4, the low-energy cutoff was about 20 keV.)
The spectra of three bursts observed by the GBD, namely GRB870303 (spectrum S2), GRB880205
(spectrum b), and GRB890929, were found to exhibit apparently harmonically spaced absorption-like
line candidates at ≈ 20 and 40 keV (Murakami et al. 1988, hereafter M88; Fenimore et al. 1988;
Yoshida et al. 1991). This is out of 23 bursts examined overall. Another spectrum from an earlier
epoch of GRB870303, denoted S1, was found to exhibit a single absorption-like line candidate at
~ 20 keV (Graziani et al. 1992, 1993, hereafter G92 and G93 respectively). Analyses of the
GRB880205 and GRB890929 spectra established the significance of the line candidates to be
~ 9 x 10^[illegible] and ~ 3 x 10^[illegible], respectively (Fenimore et al.; Wang et al. 1989;
Yoshida et al.). An analysis of



*An additional burst with an apparent trough was later determined to be a solar flare; see Atteia et al. 1987.



GRB870303 established the significance of the line candidates in the spectra S1 and S2 to be
~ 1.1 x 10^-[illegible] and ~ 2.1 x 10^-[illegible], respectively (G92; we correct the values they
report, as they used an incorrect number of degrees of freedom when calculating significances).

Since Ginga, no GRB detectors possessing the low-energy sensitivity of the GBD have flown. Of
those that have flown, the ones which are in principle the most capable of detecting line candidates
are the eight Spectroscopy Detectors (SDs) of the Burst and Transient Source Experiment (BATSE),
on the Compton Gamma-Ray Observatory. The gain settings of the individual SDs differ; those
with the highest gain settings can, in principle, observe GRBs at energies ≳ 10 keV. An electronic
artifact discovered after launch affects energy calibration such that spectra are distorted in the first
~ 10 channels above the low-energy cutoff (the so-called "SLED" effect; see Band et al. 1992).
While this can possibly affect line detection, studies using simulated Ginga line candidate spectra
indicated that the BATSE SDs were still capable of detecting low-energy spectral line candidates
(Band et al. 1995). However, no line candidates were definitively detected during initial visual
searches of those BATSE SD spectra with the largest signal-to-noise ratios (Palmer et al. 1994,
Band et al. 1996). The criteria for detection included having the candidate appear in the data from
at least one SD with F-test significance < 10^[illegible], with the contemporaneous data collected
in other SDs being consistent with the continuum-plus-line(s) model. An automated line candidate
search algorithm designed by the BATSE SD team (Briggs et al. 1996) was then applied to spectra in
117 bright bursts for which there is at least one spectrum with signal-to-noise > 5 at ~ 40 keV
(Briggs et al. 1998). This automated search, which is considerably more sensitive than a visual
search, yielded 12 spectral line candidates for which the change in χ² between the continuum
and continuum-plus-line fits is > 20 (significance < 5 x 10^[illegible]). All but one of the
candidates are emission-like lines at ~ 40 keV; the exception is an absorption-like line candidate
at ≈ 60 keV. While Briggs et al. estimate the ensemble chance probability of the most-significant
feature as ~ 10^-[illegible], and state that few of these features, if any, result from statistical
fluctuations, these should not be considered definitive detections, as the contemporaneous data
from other SDs are still being examined (Briggs et al. 1999).

In sum, the Ginga observations provide strong evidence for spectral lines that has as yet
been neither independently confirmed nor refuted. There is, however, a theoretical bias against
the existence of lines, reinforced by the strong evidence supporting a cosmological distance scale for
GRBs. This has developed because few cosmological burst models have attempted to account for the
existence of harmonically spaced lines (see, e.g., Stanek, Paczyński & Goodman 1993, and Ulmer &
Goodman 1995, who attempt to account for lines by invoking gravitational femtolensing). However,
the simple lack of theoretical models does not, and cannot, rule out the possibility of spectral lines
in cosmological burst spectra. In the galactic GRB paradigm, harmonically spaced absorption-like
lines are relatively simple to explain, using cyclotron resonant scattering in the strong magnetic
field (B ~ 10^12 G) of a neutron star. Quantization of an electron's energy perpendicular to the
magnetic field B facilitates the formation of harmonically spaced lines with a spacing
ΔE ≈ 11.6 B_12 keV, where B_12 is the field strength in units of 10^12 G. (See, e.g., Fenimore
et al.; Wang et al.; Alexander and Mészáros 1989; Miller et al. 1991, 1992; Isenberg, Lamb, & Wang
1998; and Freeman et al. 1999a, hereafter Paper II.)



In this paper, we present rigorous methods of statistical inference that the reader may use 
to firmly establish the evidence for spectral lines in GRB spectra, using simple phenomenological 
models that are independent of the underlying physics of, and distance scale to, GRBs. To illustrate 
these methods, we apply them to the spectral line candidates exhibited by the S1 and S2 spectra of
GRB870303. In a companion paper (Paper II) we physically interpret these line candidates within 
the galactic GRB paradigm, using the cyclotron resonant scattering line transfer code originally 
developed by Wang, Wasserman, & Salpeter (1988). 

In §2 we describe the Ginga GBD and its observation of GRB870303. In §3, we present
a basic introduction to the statistical concepts that we use in this paper. These concepts are
discussed in greater detail in Freeman et al. (1999b), hereafter Paper III. In that work, we present
general, rigorous methodologies that address the problem of establishing the existence of a line
in a spectrum, and that are based upon both the so-called "frequentist" and the Bayesian paradigms
of statistical inference. We apply both frequentist and Bayesian methodologies in this work to ensure
robust conclusions. In §4 we describe the method by which we select the simplest continuum model
that fits the data outside the line candidate(s) (rather than, e.g., simply assuming a continuum
spectral shape). We consider a wide range of spectral models, which assures that our conclusions
are robust. Continuum model selection for Ginga GBD data is complicated by the presence of
spectral rollover at energies ≲ 5 keV, which we do not wish to model, and we show how we adapt
our method to determine which PC bins may be included in fits. In §5, we describe how we
select the simplest continuum-plus-line(s) model that adequately fits the data. We introduce a
parametrization of the exponentiated Gaussian line in terms of its equivalent width W_E and full
width at half maximum W_1/2, the use of which results in a better-behaved likelihood surface. The
model and its parametrization allow us to treat saturated lines and, in addition, to apply analytic
Bayesian inference to both saturated and unsaturated lines. We demonstrate the importance of
applying models with as few free parameters as possible, by applying saturated lines (with two,
rather than three, free parameters), and/or by harmonically linking parameters between two lines,
in fits to these moderate resolution data. We compare the selected continuum-plus-line(s) model to
the selected continuum model to evaluate the frequentist statistical significance, and the Bayesian
odds in favor, of the best-fit continuum-plus-line(s) models for GRB870303 S1 and S2. We also
determine the frequentist confidence and Bayesian credible regions for the parameters of these
best-fit continuum-plus-line(s) models. In §6 we discuss our results.



2. Observation of GRB870303 

We first summarize the characteristics of the Ginga GBD; the interested reader will find more
details in Murakami et al. (1989). The passively shielded and non-collimated GBD contained two
co-aligned instruments for detecting GRB photons. The Proportional Counter (PC), used to detect
low-energy photons, consisted of a 3-cm deep Xe-CO2 gas reservoir, with geometric area ~ 63 cm^2.
The Scintillation Counter (SC), used to detect higher-energy photons, consisted of a 1-cm thick
NaI crystal with geometric area ~ 60 cm^2, backed by a 7.6-cm diameter phototube. The entrance
window of the SC was covered by a 0.2-mm-thick aluminum sheet, whereas the entrance window
of the PC was covered with a 63.5-micron-thick layer of beryllium, which has greater transparency
than an aluminum layer of similar thickness at low energies. In both instruments, an incident
photon triggers an electron pulse; the intensity of the pulse (the pulse height) is then used to
discern the amount of energy deposited by the photon. Because the photon may not deposit all
its energy in these detectors, the PC and SC record the number of counts as a function of photon
energy loss, in 16 and 32 semi-logarithmically spaced bins, respectively. To avoid the effect of
uncertain discriminator settings, we do not consider the lowest and highest energy-loss bins in each
detector (Murakami, private communication). Excluding these bins, the PC and SC cover 1.4-23.0
keV and 16.1-335 keV, respectively, for the gain setting at the time that GRB870303 occurred. At
the line candidate energy of ~ 20 keV, the energy resolutions of the PC and SC are ~ 3.4 and 5
keV, respectively; at 40 keV, the resolution of the SC is ~ 8.4 keV.

The GBD detected GRB870303 at 16:23 UT on 3 March 1987. Figure 1 shows burst-mode 
time history data for the PC and SC. The GBD continuously recorded burst-mode data at 0.5- 
second intervals. These data were not stored in memory until a burst was detected, at which 
time the data from 16 seconds prior to the burst trigger until 48 seconds after the burst trigger 
were stored. The peak count rate (determined within a 4 second interval) in the SC is ~ 379 cts
s^-1. The background rate in the SC is ~ 572 cts s^-1. In addition to burst-mode data, the GBD
also continuously recorded the gamma-ray background in real-time mode, which had a coarse time 
resolution (usually 16 seconds). These data are used to estimate the background count rate during 
the burst, in each energy-loss bin. By analyzing 150 seconds of real-time data from before the 
burst, and 220 seconds of real-time data from after the burst, we determine that the background 
amplitude is constant as a function of time throughout the burst interval. 

The background-subtracted spectral data for GRB870303 exhibit line candidates during two
time intervals, the spectra of which we denote S1 and S2 (following G92). Figure 2 shows both
spectra. S1 is constructed from 4 seconds of data, during which the burst had energy fluence
1.3 x 10^-[illegible] erg cm^-2 in the bandpass 50-300 keV. (This fluence is estimated from the
best-fit model; Table 6.) It exhibits a saturated line candidate at ~ 20 keV. S2 is constructed
from 9 seconds of data, during which the burst had energy fluence 4.5 x 10^-[illegible] erg cm^-2.
It exhibits two harmonically spaced line candidates, at ~ 20 and 40 keV. The midpoints of the time
intervals from which S1 and S2 are constructed lie 22.5 seconds apart.

Neither the PC nor the SC could intrinsically determine the angle of incidence of burst photons
relative to the detector normal, θ_inc. The burst detector on the Pioneer Venus Orbiter (PVO) also
observed GRB870303; combining the photon time-of-arrival information from the Ginga and PVO
spacecraft limits the possible directions of the burst to an annulus on the sky. The burst photon
angle of incidence is thus constrained to lie within the range 11.2° ≲ θ_inc ≲ 57.6° (Yoshida, private
communication, correcting Yoshida et al. 1989). In their analyses of the GRB870303 data, M88,
G92, and G93 assume θ_inc = 37.7°. Since the shape and amplitude of a model counts spectrum
that is derived from a given photon spectrum depend sensitively on θ_inc, we treat this angle as a
freely varying model parameter in this work. Because Ginga response matrices are computed using
computationally intensive Monte Carlo simulations, we use a grid of fixed values
0.54 ≤ cos θ_inc ≤ 0.98, with Δ(cos θ_inc) = 0.02. This grid is sufficiently dense to allow us to
accurately determine statistical quantities such as line significance (see, e.g., Figure 6).
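The incidence-angle grid described above can be sketched in a few lines of Python. This is only an illustration: the function names and the `fit_statistic` callable are hypothetical, and the grid endpoints are taken to be inclusive, which is an assumption about the text's inequality signs.

```python
import math

# Grid bounds and spacing quoted in the text (endpoints assumed inclusive).
COS_MIN, COS_MAX, COS_STEP = 0.54, 0.98, 0.02

def cos_theta_grid(lo=COS_MIN, hi=COS_MAX, step=COS_STEP):
    """Return the cos(theta_inc) grid, built from integer steps to avoid float drift."""
    n_steps = round((hi - lo) / step)
    return [round(lo + k * step, 2) for k in range(n_steps + 1)]

def best_grid_angle(fit_statistic, grid):
    """Pick the grid value minimising a user-supplied fit statistic.

    `fit_statistic` is a hypothetical callable: cos(theta_inc) -> statistic,
    where the remaining model parameters have already been optimised.
    """
    return min(grid, key=fit_statistic)

grid = cos_theta_grid()
angles_deg = [math.degrees(math.acos(c)) for c in grid]
```

Each grid value would correspond to one precomputed response matrix, so treating the angle as a free parameter amounts to repeating the fit once per grid point and keeping the best result.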



3. Statistical Principles 

We analyze the line candidates in the spectra of GRB870303 S1 and S2 using both frequentist
and Bayesian methods of model comparison and parameter estimation. In this section, we provide 
a basic introduction to those elements of frequentist and Bayesian statistical inference relevant for 
the analysis of gamma-ray burst spectral lines. The reader will find more detail on these methods 
in Paper III and references therein. 



3.1. Model Comparison 

3.1.1. Frequentist Method 

The frequentist comparison of two models, the null hypothesis H_0 and the alternative
hypothesis H_1, is carried out by constructing a test statistic T, which is usually a function of the
goodness-of-fit statistics for both models. There are two probability distribution functions, or
PDFs, which indicate the a priori probability that we would observe the value T, computed
assuming the truth of H_0 and H_1, respectively. The test significance, α, or Type I error, is
calculated by computing the tail integral of the H_0 PDF from T to infinity. The resulting number
represents the probability of selecting the alternative hypothesis H_1 when in fact the null
hypothesis H_0 is correct; if the number is sufficiently small, we reject H_0 in favor of H_1. A
common threshold for rejecting the null hypothesis is α < 0.05, though in this work we use more
conservative threshold values.

For the particular case of GRB spectral analysis, the appropriate sampling distribution for the
data is the Poisson distribution, and the likelihood function ℒ, the product of Poisson probabilities
for the data in each bin, given model count rates, provides the best means to assess the viability
of a model. H_0 is the model with no line(s), H_1 is the model with line(s), and T is a function of
the maximized likelihoods of the two models. (The best-fit point, or mode, in parameter space is
where the likelihood function is maximized.) To determine the H_0 PDF, one would simulate large
numbers of datasets from the best-fit model for H_0 (i.e. with the model parameters set to best-fit
values), and determine the distribution of observed values of T. After the H_0 PDF is determined,
finding the significance α is trivial.

However, this process may be computationally intensive. The frequentist therefore often falls back
upon the understanding that in the limit of a large number of counts n in a bin, the Poisson
distribution is very nearly Gaussian with a standard deviation "root-n". This understanding, in
principle, allows the use of Pearson's χ² statistic, an approximation of L = log ℒ, to assess models:

    S^2 = \sum_{i=1}^{N} \frac{(n_i - m_i)^2}{\sigma_i^2}.    (1)

The sum extends over N data bins, and m_i and n_i are the predicted and observed counts in bin i,
respectively. The best-fit parameters for a given model are those for which S² is minimized. This
statistic has the advantage that analytic formulae may be available to determine line candidate
significance. Under the same assumption of a paraboloidal log-likelihood function in parameter
space, S² is sampled from the χ² PDF (in this paper, we follow the notation of Lampton, Margon,
& Bowyer 1976, who reserve the symbol χ² for a statistic which is explicitly sampled from the χ²
distribution). There is some ambiguity in the choice of σ_i²; two widely used choices are σ_i² = n_i
("data variance"), and σ_i² = m_i ("model variance"). We denote fit statistics using these two
variance choices as S²_d and S²_m, respectively.
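As a small illustration of the two variance choices, the following Python sketch evaluates the statistic both ways for invented counts. The function name and the numbers are hypothetical; note that the data-variance form is undefined for a bin with zero observed counts.

```python
def s_squared(observed, predicted, variance="model"):
    """Sum of (n_i - m_i)^2 / sigma_i^2 with sigma_i^2 = m_i ("model") or n_i ("data")."""
    if variance not in ("model", "data"):
        raise ValueError("variance must be 'model' or 'data'")
    total = 0.0
    for n, m in zip(observed, predicted):
        sigma2 = m if variance == "model" else n   # data variance fails if n == 0
        total += (n - m) ** 2 / sigma2
    return total

observed = [90, 110, 100]     # invented counts
predicted = [100, 100, 100]   # invented model prediction
s2_model = s_squared(observed, predicted, "model")
s2_data = s_squared(observed, predicted, "data")
```

Here the two choices differ only slightly, because every bin holds many counts; the difference grows in sparsely populated bins and near strong dips.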

In this work, we compare models H_0 and H_1 using the χ² Maximum Likelihood Ratio (χ²
MLR) test (Eadie et al. 1971, pp. 230-232). In Paper III, we demonstrate that the use of this test
results in fewer Type I errors than both the F-test and the χ² Goodness-of-Fit (GoF) test. It is
also the most powerful test (Eadie et al., pp. 219-220). In order to use it, the simpler model must
be nested within the more complicated alternative model, i.e. the simpler model must be obtainable
by setting the extra ΔP = P_1 − P_0 parameters of the alternative model to default values, often
zero. The χ² MLR test statistic is ΔS² = S²(H_0) − S²(H_1). If the Gaussian approximation is valid,
p(ΔS²|H_0) is given by the χ² distribution for ΔP degrees of freedom.
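When the Gaussian approximation holds, the significance follows from the tail of the χ² distribution with ΔP degrees of freedom. The sketch below computes that tail with the Python standard library only, using the standard recurrence for the regularized upper incomplete gamma function; the function names are hypothetical and the code assumes an integer number of degrees of freedom.

```python
import math

def chi2_sf(x, dof):
    """Tail probability P(chi^2_dof >= x) for integer dof >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if dof % 2 == 0:
        q, a = math.exp(-half), 1.0               # closed form for dof = 2
    else:
        q, a = math.erfc(math.sqrt(half)), 0.5    # closed form for dof = 1
    # Recurrence: Q(a + 1, x/2) = Q(a, x/2) + (x/2)^a exp(-x/2) / Gamma(a + 1)
    while 2 * a < dof:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
        a += 1.0
    return q

def mlr_significance(s2_null, s2_alt, p_null, p_alt):
    """Significance of the alternative model from Delta S^2 with Delta P degrees of freedom."""
    return chi2_sf(s2_null - s2_alt, p_alt - p_null)
```

Library routines such as `scipy.stats.chi2.sf` compute the same tail probability and would normally be preferred where available.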

A sufficient condition that S² is distributed as χ² (S² ~ χ²) is that p(n_i|m_i) be Gaussian
with mean m_i and width σ_i. This condition is not met if we fit a continuum-only model to data
that has a pronounced absorption-like or emission-like line candidate, regardless of the choice of
variance: we will be calculating the significance with which H_0 is to be rejected in the regime where
the χ² approximation to the likelihood breaks down. (This is because a second-order Taylor series
expansion of the Poisson log-likelihood, as a function of the deviation of n_i from m_i, is not
sufficiently accurate when that deviation is large; additional terms must be included. It is
precisely the second-order expansion which can be recast as χ².) Consequently, the significance
calculated by looking up ΔS² in the χ² distribution with ΔP degrees of freedom cannot be expected
to agree with the "true" significance, i.e., the tail integral of the true H_0 PDF determined using
the Poisson likelihood function. For the case relevant for this paper, absorption-like line
candidates, the use of model variances will cause the "true" significance to be underestimated,*
while the use of variances derived from the data will lead to overestimates of the "true"
significance (i.e., α_{χ²MLR,S²_m} > α_ℒ, and α_{χ²MLR,S²_d} < α_ℒ). As shown in Paper III, we have
found that model variances provide a better estimate of the true significance, so we calculate
variances from the model in this work.



*To avoid semantic confusion: the smaller the value of α, the greater the "significance," in a qualitative sense.
Throughout this paper, we follow the convention that H_1 becomes "more significant" as α → 0.



Another problem with the use of the χ² MLR test is the condition that estimates for the values
of the additional parameters introduced by H_1 must be drawn from normal distributions (Eadie
et al., p. 232). This condition is not met here, because the line-centroid energy is drawn from a
uniform distribution over the detector bandpass (e.g. a spurious line could just as easily be seen
at 100 keV as at 20 keV). As discussed in Paper III, this tends to lessen the significance of any
detected line (i.e. α_{ℒ,true} > α_ℒ). The magnitude of the decrease in significance is dependent
on the number of data bins and the width of the line. Our simulations indicate that for the specific
case of the Ginga GBD, α_{χ²MLR,S²_m} ≈ α_{ℒ,true}.



3.1.2. Bayesian Method 

As noted above, the appropriate sampling distribution for counts data is the Poisson
distribution, and the likelihood function ℒ, the product of Poisson probabilities for the data in each
bin, given model count rates, provides the best means to assess the viability of a given model M.
The viability of a model in the frequentist method is assessed in part by maximizing the likelihood
function, but in the Bayesian method, we integrate the likelihood function over the P-dimensional
model parameter space. The resulting quantity is called the average likelihood:

    p(D|M,I) = \int dx\, p(x|M,I)\, p(D|M,x,I) = \int dx\, p(x|M,I)\, \mathcal{L}(x).    (2)

In this equation, D represents the data, while x represents the freely varying parameters of model
M, and I represents information relevant to the analysis (e.g. detector bandpass). The likelihood
is weighted at each point in parameter space by the conditional probability p(x|M,I), called the
prior probability, or simply the prior. The prior is a quantitative statement of our state of
knowledge about the relative probability of each possible value of the parameter x before the data
are examined. There is a large body of literature on the subject of how to assign priors (see Loredo
1992 and references therein), which we will not summarize here. When possible, we prefer to use
"least informative" uniform priors (i.e. constant amplitude functions), with finite bounds which are
determined in a physically meaningful way (see Appendix B).

Bayes' Theorem allows us to calculate the posterior probability p(M|D,I) for model M, given
its average likelihood:

    p(M|D,I) = p(M|I)\, \frac{p(D|M,I)}{p(D|I)}.    (3)

Here, p(M|I) is the prior of the model itself (as opposed to the values of each of its parameters), and
p(D|I) is a normalization factor. A large posterior probability indicates support for the given model.
Instead of computing such a probability directly, we determine the ratio of posterior probabilities
for any two models within a specified set of models {M_i}. This quantity is called the odds:

    O_{21} = \frac{p(M_2|D,I)}{p(M_1|D,I)} = \frac{p(M_2|I)\, p(D|M_2,I)}{p(M_1|I)\, p(D|M_1,I)} = \frac{p(M_2|I)}{p(M_1|I)}\, B_{21}.    (4)

Benefits of computing the odds are that we can ignore the normalization p(D|I) and, if we have
no a priori preference for either model, the model priors p(M_i|I). The ratio of average
likelihoods, denoted B_21 above, is termed the Bayes factor. A Bayes factor of > 10-20 is considered
strong evidence in favor of the alternative model, while a Bayes factor in excess of 100 is considered
decisive (see the review by Kass & Raftery 1995 and references therein).

Generally, the odds must be computed using numerical methods. However, if the shape of the
likelihood surface in parameter space is similar to that of a multi-dimensional Gaussian function,
we may use the Laplace approximation to estimate the average likelihood (see, e.g., Kass & Raftery,
Loredo & Lamb 1992):

    p(D|M_i,I) = p(\hat{x}_{i,1} \cdots \hat{x}_{i,P_i}|I)\, (2\pi)^{P_i/2}\, \sqrt{\det V_i}\; \mathcal{L}_{i,\max}.    (5)

V_i is the covariance matrix, determined by inverting the matrix of likelihood function second
derivatives evaluated at the mode. To derive eq. (5), we first assume that the prior does not vary
markedly around the mode, so that the prior term in the integrand of eq. (2) may be replaced by a
constant evaluated at the mode. (The hats placed on the parameters x signify that we adopt their
values at the mode when evaluating the prior.) What is left is the integral of ℒ_i, which
we assume has multi-dimensional Gaussian shape: ℒ_{i,max} × G(x_i). Since G(x_i) is an unnormalized
Gaussian function, its integral is (2π)^{P_i/2} √(det V_i).

We thus approximate the odds as

    O_{21} = e^{\Delta L_{\max}}\, (2\pi)^{\Delta P/2}\, \frac{\sqrt{\det V_2}\; p(\hat{x}_{2,1} \cdots \hat{x}_{2,P_2}|I)}{\sqrt{\det V_1}\; p(\hat{x}_{1,1} \cdots \hat{x}_{1,P_1}|I)}.    (6)

(In this expression we use the log-likelihood, L = log ℒ.) This expression is sufficiently accurate
(generally correct to within a factor of two even if the likelihood surface deviates somewhat from
the ideal Gaussian shape) to allow us to draw firm conclusions about the relative ability of the two
models to represent the observed data.
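The Laplace approximation and the resulting odds can be sketched in Python as follows. The function names and all numerical inputs are hypothetical placeholders, and equal model priors are assumed so that the odds reduce to the Bayes factor; the determinant of the covariance matrix is taken as an input rather than computed.

```python
import math

def log_evidence_laplace(log_l_max, log_prior_at_mode, det_cov, n_params):
    """Log of: prior(mode) * (2 pi)^(P/2) * sqrt(det V) * L_max."""
    return (log_prior_at_mode
            + 0.5 * n_params * math.log(2.0 * math.pi)
            + 0.5 * math.log(det_cov)
            + log_l_max)

def odds_laplace(model_2, model_1):
    """Odds favoring model 2 over model 1, assuming equal model priors."""
    return math.exp(log_evidence_laplace(**model_2) - log_evidence_laplace(**model_1))

# Invented inputs: a 3-parameter continuum and a 5-parameter continuum-plus-line model,
# each with uniform priors whose heights at the mode are 1 / (prior volume).
continuum = dict(log_l_max=-70.0, log_prior_at_mode=-math.log(1.0e3),
                 det_cov=1.0e-4, n_params=3)
with_line = dict(log_l_max=-58.0, log_prior_at_mode=-math.log(1.0e5),
                 det_cov=1.0e-6, n_params=5)
odds = odds_laplace(with_line, continuum)
```

Working with logarithms avoids overflow when the maximized likelihoods are very small numbers, which is the usual situation for binned count spectra.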



3.2. Parameter Estimation 

3.2.1. Frequentist Method 

We employ the method of projection to determine frequentist confidence intervals (Eadie et al.;
Lampton et al.). For a given parameter, we construct a set of values, and at each point on the set,
we minimize S² with respect to the remaining parameters. If the shape of the likelihood surface in
parameter space is similar to that of a multi-dimensional Gaussian function (ℒ ∝ exp[−ΔS²/2]),
and if the value of the likelihood at the mode is much larger than the maximum value of the
likelihood along the parameter space boundary, then the nσ confidence interval for an individual
parameter is given by those values of the fixed parameter such that S² = S²_min + n².
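A minimal sketch of the projection method, assuming a toy straight-line model with known Gaussian errors (chosen because the minimization over the remaining parameter is then analytic). The data, the grid, and the function names are invented for illustration and are unrelated to the burst spectra.

```python
def profile_s2(xs, ys, sigma, slope):
    """Minimise S^2 over the intercept for a fixed slope (analytic for a straight line)."""
    resid = [y - slope * x for x, y in zip(xs, ys)]
    intercept = sum(resid) / len(resid)
    return sum(((r - intercept) / sigma) ** 2 for r in resid)

def projection_interval(xs, ys, sigma, slopes, n_sigma=1.0):
    """Range of grid slopes with S^2 <= S^2_min + n_sigma^2."""
    profile = [profile_s2(xs, ys, sigma, b) for b in slopes]
    s2_min = min(profile)
    inside = [b for b, s2 in zip(slopes, profile) if s2 <= s2_min + n_sigma ** 2]
    return min(inside), max(inside)

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 0.9, 2.2, 2.8, 4.1]                      # invented measurements
sigma = 0.2                                         # assumed known error on each y
slopes = [0.8 + 0.0005 * k for k in range(801)]     # grid of fixed slope values
lo, hi = projection_interval(xs, ys, sigma, slopes)
```

For this linear Gaussian case the profile of S² is exactly parabolic, so the interval found on the grid should match the analytic one up to the grid spacing.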

If the likelihood surface deviates markedly from the ideal Gaussian shape, such that it cannot
be cast into that shape by parameter transformation, or if the likelihood surface intersects the
parameter space boundary near the mode, the formula ΔS² = n² will not apply. (In the particular
case of an otherwise well-behaved likelihood surface which is cut off at a parameter space boundary,
the value ΔS² which defines the nσ confidence interval is < n²; the use of ΔS² = n² will thus lead to
overestimates of the confidence interval size.) In these situations, we must perform simulations to
accurately determine the confidence intervals. This is because the random variables in frequentist
theory are the data, and not the model parameters: we may not integrate over parameter space in
an attempt to make inferences about the parameters.



3.2.2. Bayesian Method 

We may determine a Bayesian credible interval for a particular parameter x of model M,
without reference to the other, "uninteresting," parameters, collectively denoted x', by marginalizing
the posterior function p(x,x'|D,I) over the space of parameters x':

    p(x|D,I) \propto \int dx'\, p(x,x'|I)\, p(D|x,x',I).    (7)

We use the proportionality symbol because we ignore the normalization factor p(D|I). The credible
interval is defined as

    z = \frac{\int_{x_1}^{x_2} dx\, p(x|D,I)}{\int dx\, p(x|D,I)},    (8)

where z is the desired probability content (e.g. 0.683 for 1σ bounds), and p(x_1|D,I) = p(x_2|D,I).*
Note that in the frequentist theory of confidence intervals, p(x_1|D,I) does not have to equal
p(x_2|D,I) (though they are equal in the projection method described above), which means that
confidence intervals, unlike credible intervals, are not unique. Only when the log-likelihood function
is paraboloidal in parameter space will the Bayesian method yield the same result as the frequentist
projection method.
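The credible-interval definition can be illustrated numerically. The Python sketch below assumes a marginal posterior already tabulated on a uniform grid and, as in the footnote, a smooth unimodal shape; the function name and the Gaussian test posterior are invented for the example.

```python
import math

def credible_interval(xs, density, content=0.683):
    """Interval with equal density at both ends enclosing `content` of the probability.

    Grid points are added in order of decreasing density until the requested
    fraction of the total is reached; on a uniform grid the spacing cancels.
    """
    order = sorted(range(len(xs)), key=lambda i: density[i], reverse=True)
    total = sum(density)
    kept, acc = [], 0.0
    for i in order:
        kept.append(i)
        acc += density[i]
        if acc >= content * total:
            break
    return xs[min(kept)], xs[max(kept)]

xs = [-5.0 + 0.01 * k for k in range(1001)]
posterior = [math.exp(-0.5 * x * x) for x in xs]   # unnormalised unit Gaussian
x1, x2 = credible_interval(xs, posterior)
```

Because points are accepted strictly by density, the two end points automatically have nearly equal posterior density, which is the condition stated in the text.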

Even if the shape of the likelihood surface is nearly Gaussian, numerical integration of the
posterior to determine credible intervals is preferable to the use of the Laplace approximation
(eq. 6), since the latter may not give sufficiently precise answers. Numerical integration may not be
feasible, however, if the number of model parameters becomes too large (≳ 5). To use the Laplace
approximation in this case, we would select a grid of fixed values for the parameter x, and at each
grid point compute

    p(x|D,I) = p(x,\hat{x}'|I)\, (2\pi)^{(P-1)/2}\, \sqrt{\det V(x,\hat{x}')}\; \exp[L(x,\hat{x}')].    (9)

After we compute p(x|D,I) at each point on the grid, we would use eq. (8) as before to determine
credible interval bounds.



*Here we implicitly assume the likelihood surface is smooth and unimodal, so that any particular value p(x|D,I)
occurs no more than twice.

4. Continuum Analysis

In order to compare models with and without lines, and to estimate the parameters of the lines
(if lines are detected), we must specify a continuum model. The radiative processes that produce
the continuum spectrum of gamma-ray bursts are unknown. Therefore, any physically reasonable
form for the continuum spectrum is a possibility, and we regard all models of burst continuum
spectra, even when of a form (such as power law or power law times exponential) produced by known
radiative processes, as purely phenomenological. We consider a wide range of possible continuum
models, in order that we may draw relatively robust conclusions from our study.

An unnecessarily complicated continuum model can reduce the frequentist significance of a 
line, and the Bayesian odds favoring a model with a line over one without. It is therefore important 
to select the simplest continuum model that adequately describes the data. In this section, we discuss 
the procedure that we use to select continuum models. 



4.1. Exclusion of the Energy-Loss Bins Affected by the Candidate Line 

In selecting the simplest continuum model that adequately describes the data, we exclude 
those energy-loss bins associated with the line candidate(s) from the fits, in order not to bias the 
outcome. To determine which energy-loss bins to exclude from fits, we first examine the raw data 
by eye to determine the approximate line-centroid energy, $E_c$, of a candidate line. An incident
photon at this energy has probability $p_i$ of being recorded as a count in the $i^{\rm th}$ energy-loss bin.
If $p_i > 0.1$, we exclude the $i^{\rm th}$ bin from fits. Using this criterion, we exclude from the continuum
model fits bins PC 14-15 and SC 2-6 from the spectrum S1 and bins PC 14-15, SC 2-6, and SC
10-14 from the spectrum S2.
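
The exclusion criterion amounts to a simple threshold on the detector response at the candidate line energy. The sketch below assumes the response is available as a mapping from bin label to probability; the function name and the numbers are hypothetical and are not GBD response values.

```python
def bins_to_exclude(response_at_line, threshold=0.1):
    """Return labels of energy-loss bins with p_i > threshold.

    response_at_line maps a bin label to p_i, the probability that a photon
    at the candidate line energy E_c is recorded in that bin.
    """
    return [label for label, p in response_at_line.items() if p > threshold]

# Hypothetical response probabilities, for illustration only (not GBD values).
response = {"SC 1": 0.02, "SC 2": 0.12, "SC 3": 0.25, "SC 4": 0.30,
            "SC 5": 0.18, "SC 6": 0.11, "SC 7": 0.02}
excluded = bins_to_exclude(response)
```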



4.2. Continuum Model Selection 

In this section, we illustrate how we apply frequentist statistical methodology to the selection 
of best-fit continuum models for the S1, S2, and combined (S1+S2) data. Later, in our Bayesian
analyses of the line candidates exhibited by these data, we adopt the continuum model selected 
using this frequentist method. We do this because the calculation of Bayesian odds favoring one 
continuum model over another requires the stipulation of limits on the allowed range of each 
continuum parameter, so that we may compute its prior. Because the GRB continuum models are 
entirely phenomenological, it is difficult to place meaningful, physically-motivated, limits on the 
priors (e.g., how should we determine the limits on a power law slope?). We stress that the decision 
not to apply Bayesian methodology to continuum model selection reflects our bias against using
subjectively chosen priors for model parameters, and should not be viewed by the reader as an
absolute injunction against the use of the Bayesian methodology to select continuum models when
analyzing gamma-ray burst data.

We show our continuum model selection algorithm in Figure 3. The selection of the best-fit
continuum model is straightforward (Figure 3a). We fit each of a specified set of continuum models
$M_i$ to the data. (Because we exclude from these fits the energy-loss bins in the vicinity of the line
candidate(s), where the model may not represent the data well, and thus where the approximations
used to derive $s^2$ from $\mathcal{L}$ may be violated, we can use the fitting statistic $s^2$ [see §3].) If two
or more models have the same number of free parameters, we choose the one that fits the
data with the lowest value of $s^2$. Beginning with the simplest model (i.e., the one with the fewest
free parameters), we compute the significance of $\Delta s^2$ for each alternative model by
computing $\alpha_{\chi^2 {\rm MLR}}(\Delta s^2, \Delta P)$, where $\Delta P$ is the number of additional free parameters introduced
by the alternative model. If $\alpha_{\chi^2 {\rm MLR}}$ is never < 0.01, we select the simplest model; otherwise, we
choose the simplest alternative model, and compare that model against all remaining more complex
alternative models. We repeat this process until a continuum model is selected.
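
The selection loop can be sketched in a few lines. This is an illustration under stated assumptions rather than the procedure of Figure 3a itself: the significance is taken to be the upper-tail probability of a $\chi^2$ variate with $\Delta P$ degrees of freedom, "simplest alternative model" is read as the simplest alternative whose significance falls below 0.01, and nesting between models is not checked. The helper names and the fit values are hypothetical.

```python
import math

def chi2_sf(x, dof):
    """Upper-tail probability of a chi-square variate with integer dof."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Start from dof = 1 or 2, then step up by two using the recurrence
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    if dof % 2:
        a, q = 0.5, math.erfc(math.sqrt(z))
    else:
        a, q = 1.0, math.exp(-z)
    while 2 * a < dof:
        q += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return q

def select_model(fits, threshold=0.01):
    """fits: list of (name, n_params, s2). Returns the selected model name."""
    # Keep only the best-fitting model at each parameter count.
    best = {}
    for name, n_par, s2 in fits:
        if n_par not in best or s2 < best[n_par][2]:
            best[n_par] = (name, n_par, s2)
    ladder = [best[k] for k in sorted(best)]
    current = ladder[0]
    while True:
        # More complex models that significantly improve on the current one.
        better = [m for m in ladder
                  if m[1] > current[1]
                  and chi2_sf(current[2] - m[2], m[1] - current[1]) < threshold]
        if not better:
            return current[0]
        current = better[0]  # simplest significant alternative

# Hypothetical fit results, for illustration only.
fits = [("PL", 2, 60.0), ("PLE", 3, 30.0), ("BPL", 4, 29.5), ("Band", 4, 29.0)]
chosen = select_model(fits)
```

With these made-up numbers the PLE fit improves strongly on the PL fit, while the four-parameter fits lower $s^2$ by only about 1 for one extra parameter, so the loop stops at PLE.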

Complicating the process of continuum selection is the fact that the magnitude of photon
absorption in the beryllium window of the GBD PC, at low energies ($E \lesssim 5$ keV), depends sensitively
upon the burst photon incidence angle $\theta_{\rm inc}$. As previously stated, for GRB870303 this angle lies
within the interval $11.2^\circ \lesssim \theta_{\rm inc} \lesssim 57.6^\circ$. Thus, for $E \lesssim 5$ keV, we cannot disentangle the
absorption caused by the window from any rollover intrinsic to the burst spectrum and any absorption
that may occur in intervening cold interstellar gas. Modeling the spectral rollover thus greatly
complicates the fitting process while leaving our conclusions about the line candidates essentially
unaffected. Hence, we add a step to the continuum selection algorithm that allows us to determine
which energy-loss bins are most affected by the rollover (Figure 3b), and we exclude these bins
from subsequent line candidate analysis. We model the spectral rollover using phenomenological
absorption by a cold interstellar gas with a column density $N_{\rm H}$ (model NH in Table 1). We start
by fitting to the data in all available PC bins (PC 2-13). If the selected best-fit model includes the
rollover parameter, we eliminate the lowest-energy bin and repeat the process of model selection,
continuing until the data select a continuum model without rollover.
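
The bin-elimination step is a loop around the model selection. The sketch below assumes a callable that refits the data and returns the name of the selected continuum model for a given list of PC bins; that interface, the convention that rollover models carry "NH" in their name, and the stand-in selector are all hypothetical.

```python
def usable_pc_bins(pc_bins, select_continuum):
    """Drop lowest-energy PC bins until the selected continuum has no rollover.

    pc_bins: bin numbers in increasing energy order.
    select_continuum: callable returning the selected model name for a bin list
    (hypothetical interface; names containing "NH" include the rollover term).
    """
    bins = list(pc_bins)
    while bins and "NH" in select_continuum(bins):
        bins = bins[1:]  # remove the lowest-energy bin and reselect
    return bins

# Stand-in selector: pretend the rollover is requested whenever PC bins
# below 8 are present. A real selector would refit the data each time.
def toy_selector(bins):
    return "PL+NH" if min(bins) < 8 else "PL"

kept = usable_pc_bins(range(2, 14), toy_selector)
```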

In our fits, we consider four phenomenological continuum models (Table 1). The four-parameter
"Band et al. model" (Band et al. 1993) adequately describes all BATSE SD spectra to which it
has been applied. The bandpasses of the BATSE SDs extend to much higher energies than did
the bandpass of the GBD SC, so the SC data for GRB870303 may be insufficiently informative
to require that the exponential cutoff energy, and/or second power law slope, of this model
be specified. Thus, we consider two simpler models nested within the Band et al. model: a
three-parameter power-law times exponential (PLE) model; and a two-parameter power law (PL) model.
We also consider a two-segment broken power law (BPL) because G92 and G93 use it to model the
continuum of S2. (This complicates the model comparison process because the PLE model is not
nested within the BPL model, so that they are not directly comparable using the $\chi^2$ MLR test.
But we never find the BPL and BPL+NH models to be the best-fit models amongst models with
4 and 5 free parameters, respectively.) In our fits, we vary the logarithms of the normalization and
energy parameters, so that the shape of the likelihood surface is more nearly Gaussian. We also apply a
"pivot" energy of 20 keV (e.g., for the PL model, we use the formula $dN/dE = A E^{-\alpha} = A'(E/20)^{-\alpha}$).
This helps reduce the size of the confidence and credible intervals for the highly correlated
parameters $A$ and $\alpha$, while also improving likelihood surface behavior. Applying this set of models within
our continuum selection algorithm, we determine that the PL and PLE models are the best-fit
continuum models for S1 and S2, respectively (Table 2).
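
The pivot reparametrization leaves the model unchanged and only relabels the amplitude. A short sketch of the relation $A = A'\,20^{\alpha}$ follows; the function names are hypothetical, and the use of a base-10 logarithm for the fitted amplitude is an assumption, since the text does not state the base.

```python
import math

PIVOT_KEV = 20.0

def power_law(energy_kev, log_a_pivot, alpha):
    """Photon flux A' (E / 20 keV)^(-alpha), with log10(A') as the fitted amplitude."""
    return 10.0 ** log_a_pivot * (energy_kev / PIVOT_KEV) ** (-alpha)

def unpivoted_amplitude(log_a_pivot, alpha):
    """Amplitude A of the equivalent form A E^(-alpha), i.e. A = A' 20^alpha."""
    return 10.0 ** log_a_pivot * PIVOT_KEV ** alpha

# Illustrative parameter values (not fitted values).
flux = power_law(7.5, log_a_pivot=-1.3, alpha=1.7)
```

At the pivot energy the flux equals $A'$ whatever the slope, which is why the amplitude and slope are less strongly correlated in this form.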

We use a similar algorithm to select continuum model(s) and bin ranges for the fit to the
combined (S1+S2) data. This process of model selection is considerably more complicated because
we must determine both whether the data explicitly request different values for some continuum
parameters common to both fits (e.g., power law slope, if we fit the PL model to S1 and the PLE
model to S2) and whether the data explicitly request separate burst photon
incidence angles $\theta_{\rm inc}$ for the two datasets. We determine that the data select the PL model for S1
and the PLE model for S2, with separate normalizations and power law slopes (Table 2).

We note that the PL model fits to the data of GRB870303 S1 with $s^2 = 18.22$ for 28 degrees
of freedom. This value of $s^2$ is strikingly small: if $s^2 \sim \chi^2$, then the probability of finding this or a
lower value of $s^2$ is 0.080. (G92, who use $s^2_d$, a different choice of energy-loss bins, and assume $\theta_{\rm inc}$
= 37.7°, compute a probability 0.023.) While this is not technically a significantly low value of $s^2$,
we point out that extensive studies of the GBD were done which demonstrate that instrumental
effects such as dead time, pulse pileup, or bin overlap in the GBD did not conspire to lower the
value of $s^2$ (e.g., Graziani 1990). Furthermore, a detailed analysis of GRB870725, a burst which
occurred while Ginga was passing over the Kagoshima Ground Station, showed that the burst mode
and real time data were still in complete agreement nearly five months after GRB870303.
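
The lower-tail probability quoted for the S1 continuum fit can be checked with the closed-form $\chi^2$ distribution for an even number of degrees of freedom. The sketch below is only a consistency check on the quoted numbers; the function name is hypothetical.

```python
import math

def chi2_cdf_even(x, dof):
    """P(chi-square <= x) for an even number of degrees of freedom."""
    if dof % 2:
        raise ValueError("closed form below needs even dof")
    z = x / 2.0
    term, tail = 1.0, 0.0
    for j in range(dof // 2):
        tail += term          # adds z**j / j!
        term *= z / (j + 1)
    return 1.0 - math.exp(-z) * tail

# Probability of a value at or below 18.22 for 28 degrees of freedom.
p_low = chi2_cdf_even(18.22, 28)
```

The result rounds to the 0.080 stated above.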



5. Line Analysis 

5.1. Line Model 

When fitting a candidate line in the spectrum of a GRB, one must choose both a line model
and a parametrization of that model. Astrophysicists often use either an additive Gaussian line,
$AL(E) = C(E) - \beta G(E)$, or an exponentiated Gaussian line, $EL(E) = C(E)\exp[-\beta G(E)]$, to
model the line. (We use the symbols $AL$ and $EL$ to denote line fluxes so as to avoid confusion with
the log-likelihood $L$.) $C(E)$ is the continuum flux, and

$$G(E) = \exp\left[-\frac{(E - E_n)^2}{2\sigma^2}\right] \qquad (10)$$

is the Gaussian line shape; $E_n$ is the line-centroid energy of the $n^{\rm th}$ harmonic line; and $\beta$ and $\sigma$
are the unnormalized strength and width of the Gaussian. We use the exponentiated line model
rather than the additive Gaussian model because the flux in the latter can be negative, which is
unphysical.
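
The difference between the two line models is easy to see numerically. The sketch below assumes the Gaussian shape $G(E) = \exp[-(E - E_n)^2 / 2\sigma^2]$; the continuum and line parameters are illustrative values, not fitted ones.

```python
import math

def gaussian_shape(e, e_line, sigma):
    return math.exp(-((e - e_line) ** 2) / (2.0 * sigma ** 2))

def additive_line(e, continuum, beta, e_line, sigma):
    """AL(E) = C(E) - beta G(E); can dip below zero for a strong line."""
    return continuum(e) - beta * gaussian_shape(e, e_line, sigma)

def exponentiated_line(e, continuum, beta, e_line, sigma):
    """EL(E) = C(E) exp(-beta G(E)); stays positive for any beta >= 0."""
    return continuum(e) * math.exp(-beta * gaussian_shape(e, e_line, sigma))

# Illustrative continuum and line parameters (not fitted values).
continuum = lambda e: 2.0 * (e / 20.0) ** -1.5
al = additive_line(20.0, continuum, beta=5.0, e_line=20.0, sigma=2.0)
el = exponentiated_line(20.0, continuum, beta=5.0, e_line=20.0, sigma=2.0)
```

At the line centroid the additive model gives a negative flux for this strong line, while the exponentiated model remains small but positive.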
The exponentiated Gaussian model can be parametrized in different ways; the choice of the
parametrization affects the shape of the likelihood surface in parameter space. Figure 4 shows
contours of constant probability density as a function of ($\beta$, $\sigma$) from fitting to an incident photon
spectrum with a given set of line parameters. The contours show that in this particular case (and
in general) the likelihood surface will not have the shape of a multi-dimensional Gaussian function;
if it did have this shape, we would observe elliptical contours. (Note that the axes of the ellipses
are not required to be parallel to parameter axes.) In frequentist statistics, the parametrization of
the line does not affect the calculated line significance (which depends only on $\Delta s^2$), but it does
affect the computation of confidence intervals. The projection method gives a confidence region
that is accurate only if the shape of the likelihood surface closely approximates that of a
multi-dimensional Gaussian. A nearly Gaussian surface is also advantageous because it greatly reduces
the computational burden of Bayesian inference, allowing us to use the approximate
expression in eq. (6) when calculating credible regions and determining the odds favoring the
continuum-plus-line model.

While many parametrizations are well-behaved when the S/N of the spectrum is large and/or
the line is strong (but not saturated), parametrizing the line in terms of its equivalent width, $W_E$,
and full-width at half-maximum, $W_{1/2}$, has many advantages compared to parametrizing it in terms
of $\beta$ and $\sigma$. First, we find that, when we take two-dimensional slices of the parameter space and
plot the probability density contours corresponding to 2 and 3$\sigma$, parametrization of the line in
terms of $W_E$ and $W_{1/2}$ yields elliptical contours over a much larger range of count rates than does
parametrization in terms of $\beta$ and $\sigma$ (see Figure 4). Second, it is
more intuitive, in the sense that the visual shape of the line is related more directly to $W_E$ and
$W_{1/2}$ than to $\beta$ and $\sigma$. Third, parametrization of the line in terms of $W_E$ and $W_{1/2}$ is useful because
the $W_{1/2}$ of the line candidates in the spectra of gamma-ray bursts is typically less than or of order
the energy resolution of the detector, so that the detector is sensitive to $W_E$ but not to $W_{1/2}$ (see
below). We discuss the details of this parametrization in Appendix A.
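
To make the two width parameters concrete, the sketch below computes them for an exponentiated Gaussian line under definitions that are assumptions here, since the actual definitions are in Appendix A: $W_E$ is taken as the integral of the fractional depth $1 - \exp[-\beta G(E)]$ over energy, and $W_{1/2}$ as the full width at half of the maximum fractional depth. The function names are hypothetical.

```python
import math

def fwhm(beta, sigma):
    """Full width at half of the maximum fractional depth 1 - exp(-beta)."""
    half_depth = 0.5 * (1.0 - math.exp(-beta))
    g_half = -math.log(1.0 - half_depth) / beta   # value of G at half depth
    return 2.0 * sigma * math.sqrt(-2.0 * math.log(g_half))

def equivalent_width(beta, sigma, n_steps=20000):
    """Integral of 1 - exp(-beta G(E)) over energy (trapezoid rule)."""
    half_range = 10.0 * sigma
    step = 2.0 * half_range / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        x = -half_range + k * step
        depth = 1.0 - math.exp(-beta * math.exp(-0.5 * (x / sigma) ** 2))
        total += depth if 0 < k < n_steps else 0.5 * depth
    return total * step

# Ratio W_E / W_1/2 for a weak line and for a strongly saturated line.
ratio_weak = equivalent_width(0.01, 1.0) / fwhm(0.01, 1.0)
ratio_saturated = equivalent_width(1000.0, 1.0) / fwhm(1000.0, 1.0)
```

Under these assumed definitions the ratio is far below unity for a weak line and close to unity for a saturated one, which is the regime in which a single width parameter suffices.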



5.2. Selection of the Line Model 

G93 point out that the standard line parametrization becomes degenerate for saturated lines:
for such lines, vast ranges of (very large values of) $\beta$ and of (very small values of) $\sigma$ result in
lines which are virtually indistinguishable from each other using moderate resolution NaI crystal
spectrometers. For example, if we convolve a Gaussian line of width $\sigma$ with a Gaussian detector
response of width $\sigma_R$, then for $\sigma \ll \sigma_R$, the width of the final Gaussian line is $\approx \sigma_R$. Thus a
saturated line may be adequately described by two parameters, its line-centroid energy $E$ and its
equivalent width $W_E$. How wide a line must be before the third line parameter, the FWHM $W_{1/2}$,
is requested by the data depends upon how informative they are; it is more likely to be requested
if the S/N of the spectrum is high. In the present context, this means that the S2 data are more
likely to request a third parameter than the S1 data.
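
The convolution statement can be made explicit. For two Gaussians the widths add in quadrature, a standard result; the symbol $\sigma_{\rm obs}$ for the observed width is introduced here only for illustration.

```latex
\sigma_{\rm obs} = \sqrt{\sigma^2 + \sigma_R^2}
  = \sigma_R \sqrt{1 + (\sigma/\sigma_R)^2}
  \approx \sigma_R \left[ 1 + \tfrac{1}{2} (\sigma/\sigma_R)^2 \right]
  \qquad (\sigma \ll \sigma_R)
```

For example, a line with $\sigma = 0.1\,\sigma_R$ is broadened by only about 0.5% relative to the detector response, so the data carry almost no information about $\sigma$ itself.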

This is important because inclusion of unnecessary line parameters reduces $\Delta s^2$ and $\Delta L$ per
line parameter. Since the number of additional parameters in the continuum-plus-line(s) model
relative to the continuum model affects the frequentist significance, and both the number of extra
parameters introduced by the continuum-plus-line(s) model and their prior ranges affect the
Bayesian odds, it is important to use the minimum number of line parameters necessary to describe
the data adequately.

The optimal search strategy for detecting a single narrow line is therefore to fit to the data a
two-parameter saturated line parametrized by ($E$, $W_{1/2}$), with the ratio $W_E/W_{1/2}$ set to its maximum
value, 1.015 (see Appendix A and Figure 5). We then check whether or not the data are adequately
described by a saturated line by comparing this fit with one in which the line is parametrized
in terms of $E$, $W_E$, and $W_{1/2}$ (and $\beta$ is constrained to be $< \beta_0$). To optimally detect apparently
harmonically spaced lines, we want to reduce the number of freely varying line parameters to the
minimum requested by the data. For two lines, the first step is to assume harmonic spacing between
the lines; this reduces the number of free parameters from six to five. The next step is to assume
that each line is saturated; this reduces the number of free parameters from five to three. To reduce
the number of free parameters to two, we link the widths of the first and second harmonics. For
the purely historical reason that line widths were once interpreted as Doppler widths of absorption
profiles (e.g., Fenimore et al. 1988), we assume $W_{E,2} = 2W_{E,1}$. We could just as easily assume
$W_{E,2} = W_{E,1}$. Since the width of the second harmonic in S2 is smaller than the energy resolution
at 40 keV, the values of $s^2$ and $\mathcal{L}$ are relatively insensitive to the assumed relation between $W_{E,1}$
and $W_{E,2}$.
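
The reduction from six free parameters to two can be written as a mapping from ($E_1$, $W_{E,1}$) to the full description of both lines. The sketch below is an illustration of the constraints listed above; the function name, the dictionary layout, and the input values are hypothetical.

```python
MAX_EW_TO_FWHM = 1.015  # maximum W_E / W_1/2 quoted in the text

def expand_two_parameter_model(e1, w_e1, width_factor=2.0):
    """Map the two free parameters (E_1, W_E,1) to both harmonic lines.

    width_factor = 2 reproduces W_E,2 = 2 W_E,1; the text notes that
    width_factor = 1 would serve equally well.
    """
    lines = []
    for harmonic, w_e in ((1, w_e1), (2, width_factor * w_e1)):
        lines.append({
            "harmonic": harmonic,
            "centroid": harmonic * e1,            # harmonic spacing
            "equivalent_width": w_e,
            "fwhm": w_e / MAX_EW_TO_FWHM,         # saturated line
        })
    return lines

lines = expand_two_parameter_model(e1=20.0, w_e1=3.0)  # illustrative values
```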

In Table 3 we list the line models that we consider when fitting the spectra S1 and S2.

The procedure that we use to compare these continuum-plus-line models is analogous to that 
used to compare continuum models, but with the following differences: we assume the continuum 
model and the range of PC energy-loss bins that we selected using continuum model comparison; 
and we restore to the fits the energy-loss bins that we excluded earlier because they were near the 
energy of the line candidate. We note that not all the models we use to fit to S2 are nested within
each other, which precludes using the $\chi^2$ MLR test to compute significances in some cases; however,
the selection of the line model was not affected by this. Use of this procedure leads to selection of
the saturated line model (with parameters $E$ and $W_E$) for both S1 and S2.

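Because $\cos\theta_{\rm inc}$ takes on discrete values in our analyses, we must apply a variation of eq. (6)
in our Bayesian method of line model selection. For any given value of $\cos\theta_{\rm inc}$, we use the Laplace
approximation to estimate the parameter space integral; we then sum these integrals over all values
of $\cos\theta_{\rm inc}$:

$$O_{21} = \frac{\sum_{\cos\theta_{\rm inc}} p(\hat{x}_{2,1},\ldots,\hat{x}_{2,P_2},\cos\theta_{\rm inc}|I)\,(2\pi)^{P_2/2}\,\sqrt{\det V_2(\cos\theta_{\rm inc})}\,\exp(L_2^{\rm max}[\cos\theta_{\rm inc}])}{\sum_{\cos\theta_{\rm inc}} p(\hat{x}_{1,1},\ldots,\hat{x}_{1,P_1},\cos\theta_{\rm inc}|I)\,(2\pi)^{P_1/2}\,\sqrt{\det V_1(\cos\theta_{\rm inc})}\,\exp(L_1^{\rm max}[\cos\theta_{\rm inc}])}. \qquad (11)$$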

As noted in §3, we assume uniform priors. Appendix B describes how we compute the prior for 
each model listed in Table 3, and Table 4 presents the formulae that we use to compute the prior for 
each model. We need not specify priors for the continuum parameters; the use of the exponentiated 
Gaussian line model allows us to factor the priors for the line and continuum parameters, so that
when we form the likelihood ratio, the continuum priors cancel. For the same reason, the prior for
$\cos\theta_{\rm inc}$ cancels out of the final expression.
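
The angle-summed Laplace estimate of the odds can be sketched as follows. This is a minimal illustration working in logarithms to avoid overflow, not the code used for the analysis. It assumes that, for each value of $\cos\theta_{\rm inc}$, the fit supplies the log prior density at the mode, the number of free parameters, the log determinant of the covariance matrix, and the maximum log-likelihood. All names are hypothetical.

```python
import math

def log_laplace_term(log_prior, n_params, log_det_cov, log_like_max):
    """log of p(x_hat|I) (2 pi)^(P/2) sqrt(det V) exp(L_max) at one angle."""
    return (log_prior + 0.5 * n_params * math.log(2.0 * math.pi)
            + 0.5 * log_det_cov + log_like_max)

def log_sum_exp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def odds(terms_model2, terms_model1):
    """Ratio of angle-summed Laplace estimates; each argument is a list of
    (log_prior, n_params, log_det_cov, log_like_max), one per cos(theta_inc)."""
    log_num = log_sum_exp([log_laplace_term(*t) for t in terms_model2])
    log_den = log_sum_exp([log_laplace_term(*t) for t in terms_model1])
    return math.exp(log_num - log_den)

# Toy check: one extra parameter with a Gaussian likelihood of width 0.5 and a
# flat prior over a range of 10, compared with a parameter-free model.
extra = (math.log(1.0 / 10.0), 1, math.log(0.5 ** 2), 0.0)
base = (0.0, 0, 0.0, 0.0)
toy_odds = odds([extra, extra], [base, base])
```

In the toy case the Laplace approximation is exact, and the odds reduce to the familiar penalty factor $\sqrt{2\pi}\,\sigma / \Delta$ for a parameter whose likelihood width $\sigma$ is much smaller than its prior range $\Delta$.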

The odds favoring Model S1-U over Model S1-S, and Model S2-B over Model S2-A, is $\lesssim$ 1:1,
indicating a roughly 50% chance that these more complex models are the correct models to select.
An odds ratio of 1:1 falls far short of the 10:1 odds criterion that would indicate sufficiently strong
evidence in favor of the more complex models.



5.3. Application to the Data of GRB870303 

5.3.1. GRB870303 S1

We estimate the frequentist significance of the spectral feature in GRB870303 S1 by comparing
fits of the PL and PL+(S1-S) models to the data (Tables 5-6, Figure 6). The significance of the
reduction in $s^2$, for two additional parameters, is $\alpha_{\chi^2 {\rm MLR}} = 3.6 \times 10^{-5}$. For reasons discussed above
in §3 and Paper III, this value is not the "true" significance that we would derive by simulating
vast numbers of datasets, but is expected to be approximately correct.

In our Bayesian analysis, we apply the PL and PL+(S1-S) models to the data and use the 
modified Laplace approximation eq. (11) to yield an estimate of the odds favoring the continuum- 
plus-line model of 114:1 (Tables 5-6; Figures 6-7). This is strong evidence in support of the line 
hypothesis. As discussed in §3, the use of the Laplace approximation assumes that a likelihood 
surface has ideal multi-dimensional Gaussian form, and our reparametrization of the line model 
helps ensure that the likelihood surface in this analysis has approximately that ideal form. The 
only way to ensure accuracy of the Laplace approximation is to perform numerical integration as 
a check. Unlike the case for computing credible regions, where a portion of the numerical error 
introduced by using sparse grids of parameter values will cancel out because one is computing a 
ratio (eq. 8; see Tierney & Kadane 1986), accurate numerical computation of the odds requires that
we use a denser grid of parameter values. Hence, we are computationally limited to performing 
numerical integration only over the five-dimensional parameter space of the PL+(S1-S) model (the
fifth parameter is the burst photon incidence angle). This integration yields odds ~ 130:1. We 
conclude that the use of our reparametrization and the Laplace approximation is accurate to well 
within a factor of two. 

In Table 7, we show the frequentist confidence and Bayesian credible intervals for the
parameters of the PL+(S1-S) model. We find that for any given value of $\cos\theta_{\rm inc}$ in the allowed range
[0.54,0.98], the confidence and credible intervals closely match, demonstrating the efficacy of our
line model reparametrization. The likelihood surface as a function of $\cos\theta_{\rm inc}$ is truncated (Figure
6); while this does not affect the computation of Bayesian credible intervals, it does cause the
confidence intervals for $\cos\theta_{\rm inc}$ and any parameter correlated with $\cos\theta_{\rm inc}$ (most notably the continuum
normalization $A$ and slope $\alpha$) to be overestimated.

^In this paper, we follow the accepted Bayesian practice of treating the odds, the ratio of average likelihoods, as
a singular quantity.



5.3.2. GRB 870303 S2 

We determine the frequentist significance of the spectral features in GRB870303 S2 by comparing
fits using the PLE and PLE+(S2-A) models to the data (Tables 5-6; Figure 6). The significance
of the reduction in $s^2$, for two additional parameters, is $\alpha_{\chi^2 {\rm MLR}} = 1.7 \times 10^{-4}$; the odds favoring
the continuum-plus-lines model is 7:1 (Tables 5-6, Figures 6-7). The line candidates in S2 are not
detected, if we apply the common criterion that the significance of the candidate line must be <
$10^{-4}$. The difference in odds between S1 and S2 is due to the S2 data being more informative:
the errors on line parameters $E_1$ and $W_{E,1}$ are smaller for S2 than for S1, reducing the average
likelihood of the continuum-plus-lines model, and the odds. In Table 8 we present frequentist
confidence and Bayesian credible intervals. As seen in Figure 6, the likelihood surface as a function of
$\cos\theta_{\rm inc}$ is truncated, leading to differences between the computed intervals that arise for the same
reason as stated above for S1.



5.3.3. Joint Fits to the Combined (S1+S2) Data 

We fit to the combined (S1+S2) data because fits to this more informative dataset can
strengthen the statistical evidence favoring the line hypothesis. We use the same continuum-plus-lines
models that we applied to the S2 data (Table 3), except that now for each model, we
test whether the data request different parameter values for S1 and S2 (i.e., we test Model S2-A
with $E_{1,S1} = E_{1,S2}$; then $E_{1,S1} \neq E_{1,S2}$; etc.). Using both frequentist and Bayesian methods, we
find the best-fit joint line model is the S2-B model, with the values of $E_1$ and $W_{E,2}$ equal for S1
and S2, and $W_{E,1,S1} \neq W_{E,1,S2}$. (We note that the best-fit two parameter model, the S2-A model,
was nested within the best-fit three parameter model, the S2-B model, which in turn was nested
within the modified S2-B model shown above; hence the use of the $\chi^2$ MLR test was valid at every
step of frequentist model comparison.)

The frequentist significance of the reduction in $s^2$, for four additional parameters, is
$\alpha_{\chi^2 {\rm MLR}} = 4.2 \times 10^{-8}$, while the odds favoring the continuum-plus-lines model is 40,300:1 (Tables 5-6; Figures
8-9). In Table 9, we present frequentist confidence intervals for the parameters of the
continuum-plus-lines fit. We find that while the intervals are somewhat inaccurate because of likelihood
surface truncation, we can safely conclude that the second harmonic line width is consistent with
zero: there is not overwhelming statistical evidence favoring the presence of the second line. We
do not compute credible regions because the number of parameters is too great. We do not use the
Laplace approximation to compute approximate credible intervals because we sample the likelihood
space only at discrete intervals in $\cos\theta_{\rm inc}$, which renders difficult the computation of covariance
matrices.

We conclude that the joint (S1+S2) dataset presents by far the strongest evidence supporting
the hypothesis that spectral lines exist in gamma-ray burst spectra.



6. Discussion 

In this paper, we analyze the data of GRB870303 S1 and S2 using rigorous statistical techniques
developed for gamma-ray burst line candidate analysis (Loredo & Lamb 1992; G92; G93; Freeman
et al. 1993, 1994; Paper III). We conclude that the line candidates exhibited by the S1 and S2
data have significances $3.6 \times 10^{-5}$ and $1.7 \times 10^{-4}$, respectively, with the Bayesian odds favoring
the continuum-plus-line(s) model being 114:1 and 7:1, respectively. Fits to the combined (S1+S2)
data show that the best-fit line model has significance $4.2 \times 10^{-8}$, with the odds favoring it being
40,300:1. The results of these fits to the combined data make the line candidates they exhibit the
most significant yet observed, easily satisfying the most conservative line detection criteria.

The S1 and S2 data were previously analyzed by M88 and G92, who report significances for the
line candidates in S2 of ~ 10~^ and 2.1 x 10~ , respectively. (We have corrected the significance 
computed by G92, because they assumed the input number of degrees of freedom for the x^ MLR 
model comparison test to be the total number of parameters in the continuum-plus-lines model, 
rather than the number of additional parameters used to parametrize the lines.) Both M88 and 
G92 assume $\theta_{\rm inc}$ = 37.7°. The line candidates in S2 are not detected, if we apply the criterion that
the significance of the candidate line must be < $10^{-4}$ (see, e.g., Palmer 1994). G92 also discovered
the line candidate in the spectrum SI, computing its significance to be 1.1 x 10~^ (also corrected). 
G93 use a Bayesian method to analyze the line candidates, and they report the odds in favor of 
the line model to be 110:1 and 2.8:1 for S1 and S2, respectively.

The differences between the analyses of M88 and G92 and our analysis are summarized in 
Table 10. Below, we discuss how each of these differences in turn alters the computed significance 
and odds of the line candidates. Unless otherwise noted, we assume $\theta_{\rm inc}$ = 37.7° in all fits that we
perform below, to facilitate comparison between our results and those derived previously. We note 
that because of this assumption, the derived significances, etc., stated below may differ somewhat 
from analogous values presented in Table 5. 



6.1. Choice of Frequentist Statistic

We use the $s^2$ statistic with variances derived from the model count rates in each energy-loss
bin ($s^2_m$). Both M88 and G92 use the $s^2$ statistic with variances derived from the data ($s^2_d$). As
noted in §3 and demonstrated in Paper III, the use of $s^2_d$ can lead to an overestimation of the
significance of an absorption line, relative to that derived using $s^2_m$. (Note that the exact opposite
is true for emission lines, where the use of $s^2_d$ is to be preferred, if the Poisson likelihood cannot be
used.) The magnitude of the difference between derived significances depends upon the size of the
analyzed line candidate. We find that if we fit to the S1, S2, and combined (S1+S2) data using $s^2_m$,
the calculated line significances are 3.3 x 10~^, 1.5 x 10~^, and 1.5 x 10~^, respectively. If we use 
s^, the respective values are 1.2 x 10"^, 4.0 x 10"^, and 3.1 x 10~^. 



6.2. Choice of Model Comparison Test 

In Paper III, we use simulations to compute model comparison statistic PDFs for the $\chi^2$ MLR
test, the F test (used, e.g., by M88), and the $\chi^2$ Goodness-of-Fit (GoF) test. We use these PDFs
to determine that the $\chi^2$ MLR test is the most powerful test of the three, i.e., that the use of
this particular test will result in the highest rate of line detection, if lines are present in the data.
Because it is the most powerful test, we use the $\chi^2$ MLR test in this paper.

Application of the F test (with test statistic -i^—^) to the SI, S2, and combined (S1+S2) 
data results in significance estimates of 6.7 x 10"^, 3.3 x 10^'^, and 1.1 x 10^^, respectively, as 
opposed to 1.2 X 10"^, 4.0 x 10"^, and 3.1 x 10"^ for the x^ MLR test. The F test renders the 
candidate line in the SI data more significant because of the unusually small value of s^^ for SI 
(22.25 for 34 degrees of freedom). (We note that this does not make the F test more powerful in 
this particular case, because test power is computed from the PDF of the model comparison test 
statistic calculated assuming the truth of the alternative hypothesis, and not from the results of 
fits to a single dataset.) 

The application of the $\chi^2$ GoF test, in which the $s^2$ of the continuum fit is compared to the $\chi^2$ distribution for
$N - P_c$ degrees of freedom, to the S1, S2, and combined (S1+S2) data leads to significances 0.15,
0.03, and 0.02. The line candidates would not be considered detected, if we assume a significance
criterion $10^{-4}$. We note that if we apply the most generally-used threshold criterion of 0.05, the
line candidate of S1 would still not be detected. This is a result of the unusually small value of $s^2$
for the continuum fit to S1.



6.3. Use of Model Comparison to Select Continua 

In §4, we describe the method with which we determine the best-fit continuum model for the
S1, S2, and combined (S1+S2) data, while also determining which low-energy PC bins to include
in fits. Fenimore et al. (1988), in their analysis of GRB880205, were the first to apply a number
of different continuum models to data. They adopt a three-segment power law as representative
of all possible models, after determining that the choice of continuum has little effect upon the
detection of lines in these data. An antecedent of the method we prescribe in this paper was used
by G92 and G93, but they did not test either the PLE or Band et al. models, nor did they use
model comparison to determine the usable range of PC bins. In both works, PC 10 is adopted as
the lowest usable bin ($E_{\rm low} \sim 5.7$ keV).

In Table 11, we show how the results of fits to S1 and S2 change if we apply the BPL model
used by G92, and the Band et al. model, to the data. We also examine how our results change if we 
limit ourselves to fitting to the data in PC 10-14 only, the PC bin range used by G92. We find that 
using the Band et al. model leads to increases in line significance, most notably for the fit to S1.
While initially encouraging, this result does not actually strengthen the evidence supporting the 
existence of lines in these data, despite the strong Bayesian prior supporting the Band et al. model. 
This is because in addition to the model prior, we must take into account our prior expectation 
for the values of each model parameter, either qualitatively or quantitatively. In the particular 
case of the fit to the data of S1, the inclusion of the (unrequested) second power-law segment
causes the best-fit model parameter values to deviate strongly from their expected values. For 
instance, the best-fit exponential cutoff energy does not lie within its characteristic range (>100 
keV); instead, it is ~ 10 keV. The model is attempting to fit what remains of the (no longer 
statistically significant) low-energy spectral rollover at energies ~ 5 keV (Figure 10). This causes 
an increase in the continuum flux at 20 keV, increasing both the equivalent width needed to fit
the data in the line region (from 9.84 to 10.7 keV) and the significance of the line. The increase 
in significance is greatly reduced when the lowest energy bins PC 8-9 are removed from the fit. 
Also, we find that we cannot compute the odds favoring the continuum-plus-line model if we use 
the Band et al. model. Parameter values along the Band et al. model space boundary defined by 
$\alpha_1 = 0.2$ are highly probable with respect to the mode (because the slope of the unrequested second
power-law segment tends towards the slope of the first power-law segment), and because of this 
boundary, our fitting program was unable to estimate the covariance matrix values for the model 
parameters. 

The fit of the Band et al. model to the S1 data demonstrates how including unjustified continuum
parameters in the fit can alter the computed line significance. We feel that Briggs et al. (1998)
provide another demonstration with their analysis of the emission-like line candidate of GRB941017, 
which was observed by the BATSE SDs. Their philosophy differs from that espoused in this paper: 
they contend that to demonstrate the existence of a line in this (or any) burst, one must show 
that the data require the line regardless of whichever reasonable continuum model is assumed. The 
use of a Band continuum model plus two-parameter line to fit the GRB941017 data collected by 
BATSE SD results in a line candidate significance of 7 x 10^^. After adding a low-energy spectral 
break to the continuum (which introduces two additional continuum parameters), the significance 
is decreased to 0.04. Thus they feel that they cannot prove the spectral feature is a line. We feel 
that there are two problems with this approach, one observational, and the other methodological. 
First, such breaks have not been observed in any other continuum spectrum, in particular those of 
bursts observed by the Ginga GBD, whose low-energy coverage is superior to that of the BATSE 
SDs. Hence we feel that the proposed continuum shape may not be reasonable. Second, Briggs et 
al. do not use model comparison to justify that the additional continuum parameters introduced
with the low-energy power law segment are necessary to adequately fit the data. 



6.4. Use of Model Comparison to Select Line Models 

Fitting to data with models that have more free parameters than necessary can lead to a
marked reduction in derived line significance. The moderate resolution data collected by the Ginga
GBD lack the informative power to require that each line candidate be fit with a line model
parametrized by line-centroid energy, equivalent width, and full-width at half-maximum, with each
value freely varying. This fact has come to be recognized through successive analyses of Ginga
GBD data. Initially, analyses of harmonically spaced line candidates by M88, Fenimore et al., and
G92 featured models containing two lines with six independently-varying parameters. Yoshida
et al. (1991) and G92 reduce the number of free parameters to five, by testing models in which
the line-centroid energies are harmonically related ($E_2 = 2E_1$). A further step towards model
simplification is taken by G93, who test a four-parameter line model with harmonically related
values of $E_n$ and $W_{1/2,n}$; they also test a two-parameter line model in fits to the S1 data, assuming
that the line is saturated ($W_{1/2} \approx W_E$). In this work, we push model simplification to its limits,
by testing both the saturated line model of G93 in fits to the S1 data and the two-free-parameter
S2-A model in fits to S2.



In Table 12, we demonstrate the effect of including unnecessary line model parameters in fits
to the S1 and S2 data, fitting the former with the three-parameter unsaturated line model (S1-U),
and the latter with five- and six-parameter models. For S1, the addition of a parameter causes no
change in $s^2$ (because the mode lies at a parameter space boundary), whereas for S2, we find that
each additional parameter lowers $s^2$ by ~1, which is just what we expect if we include in the fit
parameters that the data do not request.
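
The significances listed in Table 12 can be checked from the tabulated $s^2$ values. The sketch below assumes that the $\chi^2$ Maximum Likelihood Ratio test refers the decrease in $s^2$ to a $\chi^2$ distribution whose number of degrees of freedom equals the number of added line parameters. The function name `chi2_sf` and the layout of the `fits` list are illustrative choices, not part of the original analysis code.

```python
import math

def chi2_sf(x, k):
    """Upper-tail probability of a chi-square variate with integer k >= 1 dof."""
    if x <= 0.0:
        return 1.0
    h = 0.5 * x
    if k % 2 == 0:
        # Even k: exp(-h) * sum_{j=0}^{k/2-1} h^j / j!
        term = math.exp(-h)
        total = term
        for j in range(1, k // 2):
            term *= h / j
            total += term
        return total
    # Odd k: erfc(sqrt(h)) + exp(-h) * sum_{j=1}^{(k-1)/2} h^(j-1/2) / Gamma(j+1/2)
    total = math.erfc(math.sqrt(h))
    term = 2.0 * math.sqrt(h / math.pi) * math.exp(-h)
    for j in range(1, (k - 1) // 2 + 1):
        total += term
        term *= h / (j + 0.5)
    return total

# (label, s^2 continuum, s^2 continuum-plus-line(s), number of line parameters),
# copied from Table 12.
fits = [
    ("S1, 2 par (S1-S)", 44.84, 22.23, 2),
    ("S1, 3 par (S1-U)", 44.84, 22.23, 3),
    ("S2, 2 par (S2-A)", 53.91, 33.63, 2),
    ("S2, 5 par (S2-D)", 53.91, 31.13, 5),
    ("S2, 6 par",        53.91, 30.49, 6),
]

for label, s2_c, s2_cl, n_line in fits:
    print(f"{label}: significance = {chi2_sf(s2_c - s2_cl, n_line):.1e}")
```

For S1 the decrease in $s^2$ is identical for the two- and three-parameter models, so the extra degree of freedom alone weakens the significance. For S2 the small additional decreases in $s^2$ do not compensate for the added degrees of freedom.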



6.5. Effect of the Parametrization and Choice of Prior on Bayesian Odds 

Our Bayesian analysis differs from that reported in G93 in that we scale the continuum amplitude
and energy cutoff parameters logarithmically. This helps create a likelihood surface that more
closely resembles a multi-dimensional Gaussian, and thus makes the estimation of the covariance
matrix more accurate. We find that while this change has little effect upon the odds favoring the
line hypothesis for S1, it does increase the odds for S2 by nearly a factor of two (from 8.7:1 to 14:1,
for the S2-A model with $\theta_{\rm inc}$ = 37.7°, and for our chosen PC bin range).

^a Specifically, for their computation of the line candidate significance for the S2 data.
^b For their estimation of parameter values for S2.
A. The $W_E$-$W_{1/2}$ Parametrization

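The equivalent width, $W_E$, of an exponentiated Gaussian line $G(E)$ in eq. (10) is

$$W_E = \int_0^\infty dE\,\left(1 - \exp[G(E)]\right) = \sqrt{2}\,\sigma\,\Phi(E,\beta,\sigma), \qquad (A1)$$

where

$$\Phi(E,\beta,\sigma) = \int_{-E/(\sqrt{2}\sigma)}^{\infty} dx\,\left[1 - \exp\left(-\beta e^{-x^2}\right)\right] \approx \int_{-\infty}^{\infty} dx\,\left[1 - \exp\left(-\beta e^{-x^2}\right)\right] \equiv \Phi(\beta). \qquad (A2)$$

The approximation $\Phi(E,\beta,\sigma) \approx \Phi(\beta)$ is satisfied in all cases where we infer a low-energy line
candidate in data, breaking down only as $\beta \to \exp(E^2/2\sigma^2)$. In the following, the limit $\beta \to \infty$ has
the meaning of $1 \ll \beta < \exp(E^2/2\sigma^2)$.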



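As $\beta \to 0$, the function $\Phi(\beta)$ has the limit

$$\lim_{\beta \to 0} \Phi(\beta) = \int_{-\infty}^{\infty} dx\,\beta e^{-x^2} = \beta\sqrt{\pi}. \qquad (A3)$$

As $\beta \to \infty$, the integrand of $\Phi(\beta)$ in eq. (A2) behaves approximately as a box function $B(x)$,
where

$$B(x) = \begin{cases} 1, & |x| \le x_0 \\ 0, & \text{otherwise.} \end{cases} \qquad (A4)$$

We estimate $x_0$ by assuming that $\beta e^{-x_0^2} = 1$, or

$$x_0 = \sqrt{\log\beta}. \qquad (A5)$$

Thus

$$\lim_{\beta \to \infty} \Phi(\beta) \approx 2\sqrt{\log\beta}. \qquad (A6)$$

Differentiating $\Phi(\beta)$, we find that for all $\beta$,

$$\frac{d\Phi}{d\beta} > 0 \quad \text{and} \quad \frac{d^2\Phi}{d\beta^2} < 0. \qquad (A7)$$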
We may invert the mapping $\beta \to \Phi(\beta)$ for all values of $\beta$, although for even moderate values of $\Phi$
the corresponding value of $\beta$ may be very large. It is numerically straightforward to invert $\Phi$ and
conclude the reparametrization $(\sigma,\beta) \to (\sigma,W_E)$.

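We use the physical full-width at half-maximum of the line itself (not of the Gaussian) to define
a more physically meaningful parametrization:

$$\frac{1 - \exp[G(E + W_{1/2}/2)]}{1 - \exp[G(E)]} = \frac{1}{2}. \qquad (A8)$$

We invert this equation, and use eq. (10), to write $W_{1/2}$ as:

$$W_{1/2} = 2\sqrt{2}\,\sigma \sqrt{\log\beta - \log\left[\log\left(\frac{2}{1 + e^{-\beta}}\right)\right]}. \qquad (A9)$$

It follows that

$$\lim_{\beta \to 0} W_{1/2} = 2\sigma\sqrt{2\log 2}, \qquad (A10)$$

and that

$$\lim_{\beta \to \infty} W_{1/2} = 2\sigma\sqrt{2\left(\log\beta - \log(\log 2)\right)}. \qquad (A11)$$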



The reparametrization $(\sigma,\beta) \to (W_E, W_{1/2})$ is simplified by the observation that the ratio
$r = W_E W_{1/2}^{-1}$ depends only on $\beta$. From eqs. (A1), (A3), and (A10), we find

$$\lim_{\beta \to 0} \frac{W_E}{W_{1/2}} = 0, \qquad (A12)$$

while from eqs. (A1), (A6), and (A11), we find

$$\lim_{\beta \to \infty} \frac{W_E}{W_{1/2}} = 1. \qquad (A13)$$



The ratio $W_E/W_{1/2}$ is nearly, but not quite, a monotonic function of the unnormalized Gaussian
amplitude $\beta$: it rises sharply from zero, reaching $W_E/W_{1/2} = 1$ when $\beta \approx 4.75$ and peaking at
$\approx 1.015$ when $\beta = \beta_0 \approx 18.7$, before tapering off to 1 as $\beta \to \infty$ (Figure 5). Thus the line begins
to saturate when $\beta \approx 4.75$ and approaches a square well shape as $\beta \to \infty$. An observed line has
a value of $W_E/W_{1/2}$ which falls in the range $0 < W_E/W_{1/2} < 1.015$ (and therefore $\beta$ in the range
$0 < \beta < \beta_0$) unless it is highly saturated. We constrain $\beta$ to this range, with little loss in generality,
as statistical fits made without this constraint will differ very little from those made with it, unless
the S/N of the line is extremely large.
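
The behavior of the ratio described above can be reproduced numerically. The sketch below assumes the forms $W_E = \sqrt{2}\,\sigma\,\Phi(\beta)$ from eqs. (A1)-(A2) and $W_{1/2}$ from eq. (A9), so that $\sigma$ cancels in the ratio. The function names, the Simpson-rule quadrature, and the integration cutoff are illustrative choices and are not taken from the original fitting code.

```python
import math

def phi(beta, n=4000):
    """Phi(beta): integral over all x of 1 - exp(-beta * exp(-x^2)), by Simpson's rule."""
    # Beyond x_max the argument beta * exp(-x^2) is below e^-40, so the tail is negligible.
    x_max = math.sqrt(max(math.log(beta), 0.0) + 40.0)
    h = x_max / n  # n must be even for Simpson's rule

    def integrand(x):
        return -math.expm1(-beta * math.exp(-x * x))

    total = integrand(0.0) + integrand(x_max)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return 2.0 * total * h / 3.0  # factor 2: the integrand is even in x

def half_width_over_sigma(beta):
    """W_1/2 / sigma: full width at half maximum of the line itself."""
    inner = math.log(2.0 / (1.0 + math.exp(-beta)))
    return 2.0 * math.sqrt(2.0 * (math.log(beta) - math.log(inner)))

def width_ratio(beta):
    """r = W_E / W_1/2, which depends on beta only."""
    return math.sqrt(2.0) * phi(beta) / half_width_over_sigma(beta)

for beta in (0.1, 1.0, 4.75, 18.7, 100.0):
    print(f"beta = {beta:7.2f}   W_E/W_1/2 = {width_ratio(beta):.4f}")
```

Evaluating `width_ratio` on a grid of `beta` values gives a curve of the kind shown in Figure 5: a steep rise from zero, a crossing of unity near $\beta \approx 4.75$, and a shallow maximum near 1.015.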






B. Bayesian Prior Probability Distribution for the $W_E$-$W_{1/2}$ Parametrization

There are no standard rules for determining the range and shape of the prior probability
distribution in Bayesian methodology (see, e.g., Loredo 1992 and references therein). In this paper,
we seek to assign priors that are "least informative." We assume uniform, i.e., flat, distributions,
bounded in a physically meaningful way.

For a single line, we use the product rule of probability theory to expand the prior:

$$p(E, W_E, W_{1/2} \mid I) = p(W_{1/2} \mid W_E, E, I)\, p(W_E \mid E, I)\, p(E \mid I). \qquad (B1)$$

$I$ represents background information about the experiment, such as detector bandpass, which allows
us to specify a flat and bounded distribution of the line-centroid energy $E$:

$$p(E \mid I) = \frac{1}{E_{\rm high} - E_{\rm low}}. \qquad (B2)$$

If E' < ^-Ehigh) we may specify p{We, E\I) by assuming 

$$W_E \le \eta W_{1/2} \le 2\eta(E - E_{\rm low}), \qquad (B3)$$

where $\eta \approx 1.015$, the maximum value of the ratio $W_E/W_{1/2}$. A larger value of $W_{1/2}$ would lead us to
infer that there is a low-energy rollover in the spectrum, and not a line candidate. Thus

$$p(W_E \mid E, I) = \frac{1}{2\eta(E - E_{\rm low})}. \qquad (B4)$$

The prior for a saturated line is simply the product of eqs. (B2) and (B4). If the line is not
saturated, we use eq. (B3) to specify the prior for $W_{1/2}$:

p(W.\We,EJ) = ^. (B5) 

2 

The product of eqs. (B2), (B4), and (B5) gives the prior for an unsaturated line. 

Assignment of the priors for two lines follows a similar procedure. We note three major
differences:

• if the model has harmonically spaced lines, the prior range for $E_1$ is reduced: $p(E_1 \mid I) = \left(\frac{E_{\rm high}}{2} - E_{\rm low}\right)^{-1}$;

• the prior ranges for both $W_{E,1}$ and $W_{E,2}$ are adapted to take into account the fact that the
two lines cannot overlap and be identified as two separate lines;

• and if the model has lines that are not harmonically spaced, then $p(E_2 \mid E_1)\,p(E_1 \mid I) = [(E_{\rm high} - E_1)(E_{\rm high} - E_{\rm low})]^{-1}$.
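
As an illustration of how these flat, bounded priors combine, the sketch below evaluates the saturated single-line prior and the harmonic-centroid prior. It assumes that the prior on $W_E$ is flat on $[0, 2\eta(E - E_{\rm low})]$, as implied by eq. (B3) and by the S1-S entry of Table 4, and it ignores any additional restriction near the upper bandpass boundary. The function names and the bandpass values in the usage lines are placeholders chosen for illustration, not values taken from the analysis.

```python
ETA = 1.015  # maximum value of W_E / W_1/2 (Appendix A)

def prior_centroid(e_low, e_high):
    """Flat prior density on the line-centroid energy over the bandpass."""
    return 1.0 / (e_high - e_low)

def prior_saturated_line(energy, e_low, e_high, eta=ETA):
    """Joint prior density p(E, W_E | I) for a saturated line.

    Assumes W_E is flat on [0, 2 * eta * (E - e_low)]; returns 0 outside the bandpass.
    """
    if not (e_low < energy < e_high):
        return 0.0
    return prior_centroid(e_low, e_high) / (2.0 * eta * (energy - e_low))

def prior_harmonic_centroid(e_low, e_high):
    """Flat prior density on E_1 when a second line sits at E_2 = 2 * E_1."""
    return 1.0 / (0.5 * e_high - e_low)

# Placeholder bandpass boundaries in keV (hypothetical values).
E_LOW, E_HIGH = 2.0, 400.0
print(prior_saturated_line(21.4, E_LOW, E_HIGH))
print(prior_harmonic_centroid(E_LOW, E_HIGH))
```

Because the allowed range of $W_E$ grows with $E - E_{\rm low}$, the joint density is smaller for lines far above the lower bandpass boundary, which is how the prior penalizes the larger volume of parameter space available to such lines.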
Fig. 1. — Ginga Proportional Counter (PC; top) and Scintillation Counter (SC; bottom) time
histories of GRB870303. The PC data are presented in 1 s bins, the SC data in 0.5 s bins. The
burst triggered the recording of Ginga burst-mode data at ~16 s; the preceding 16 s of burst-mode
data, in memory at the time of the trigger, were recorded and not overwritten. Burst-mode lasts
for 64 seconds. Epochs S1 (4 seconds) and S2 (9 seconds) are shown; the midpoints of S1 and S2
are separated by 22.5 seconds.

Fig. 2. — Ginga GBD count-rate spectra for intervals S1 and S2 of GRB870303, normalized by
energy-loss bin width. 

Fig. 3. — (a): This flow chart illustrates how we select the best-fit continuum model. We begin by
comparing the simplest model ($M_1$) with all alternative models that have a greater number of free
parameters ($M_2$-$M_N$), computing the significance of the decrease in $s^2$ for each ($\alpha_{1,2}$-$\alpha_{1,N}$). If no
alternative model satisfies the criterion $\alpha_{\chi^2{\rm MLR}} < 0.01$, we select the simplest model; otherwise, we
select the simplest alternative model that fulfills the criterion and repeat the comparison process,
continuing until a continuum model is selected (i.e., until no alternative model satisfies the
criterion). (b): This flow chart illustrates how we select the range of usable PC bins. We do not use all
PC data because of the difficulty of modeling the spectral rollover at energy-losses $\lesssim$ 5 keV. The
box "Select Continuum Model" refers to the flow chart in (a).
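
The selection loop described in the caption of Fig. 3(a) can be sketched in a few lines. This is a minimal illustration under stated assumptions: each candidate fit is summarized by its number of free parameters and its minimum $s^2$, the decrease in $s^2$ is referred to a $\chi^2$ distribution with as many degrees of freedom as there are added parameters, and the models are treated as nested. The `Fit` type, the function names, and the $s^2$ values in the usage lines are hypothetical.

```python
import math
from typing import NamedTuple

class Fit(NamedTuple):
    name: str
    n_params: int
    s2: float  # minimum value of the fitting statistic s^2

def chi2_sf(x, k):
    """Upper-tail probability of a chi-square variate with integer k >= 1 dof."""
    if x <= 0.0:
        return 1.0
    h = 0.5 * x
    if k % 2 == 0:
        term = math.exp(-h)
        total = term
        for j in range(1, k // 2):
            term *= h / j
            total += term
        return total
    total = math.erfc(math.sqrt(h))
    term = 2.0 * math.sqrt(h / math.pi) * math.exp(-h)
    for j in range(1, (k - 1) // 2 + 1):
        total += term
        term *= h / (j + 0.5)
    return total

def select_model(fits, threshold=0.01):
    """Return the simplest model that no more complex model improves on significantly."""
    current = min(fits, key=lambda m: m.n_params)
    while True:
        requested = [
            m for m in fits
            if m.n_params > current.n_params
            and chi2_sf(current.s2 - m.s2, m.n_params - current.n_params) < threshold
        ]
        if not requested:
            return current
        # Move to the simplest alternative that meets the criterion, then repeat.
        current = min(requested, key=lambda m: m.n_params)

# Hypothetical s^2 values, for illustration only.
candidates = [Fit("PL", 3, 60.0), Fit("PLE", 4, 45.0), Fit("Band", 5, 44.5)]
print(select_model(candidates).name)
```

The loop always terminates because the number of free parameters of the current model strictly increases at each step.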

Fig. 4. — Probability contours resulting from the use of various combinations of unnormalized
Gaussian amplitude and width ($\beta$, $\sigma$), equivalent width $W_E$, and full-width $W_{1/2}$, to parametrize line
shape. We show 1, 2, and 3$\sigma$ contours, representing 68.3%, 95.5%, and 99.7% of the integrated
probability. Only the ($W_{1/2}$, $W_E$) parametrization contours, shown at lower left, show the elliptical
behavior required to use eq. (6) to compute the Bayesian odds.

Fig. 5. — The ratio of line equivalent width, $W_E$, to line full-width, $W_{1/2}$, as a function of Gaussian
amplitude $\beta$. This ratio is not a function of the Gaussian width $\sigma$. It reaches 1 (short dashed line)
when $\beta \approx 4.75$ and peaks at $r \approx 1.015$ when $\beta = \beta_0 \approx 18.75$ (long dashed line). We set $\beta = \beta_0$
when fitting saturated lines to data.

Fig. 6. — Values of the fitting statistics $s^2$ (squares) and $L$ (circles) as a function of $\cos\theta_{\rm inc}$, for fits to
the data of GRB870303 S1 (top) and GRB870303 S2 (bottom). Unfilled shapes represent continuum
model fits, while filled shapes represent continuum-plus-line(s) model fits. Any jaggedness in fitting
statistic values as a function of angle reflects the use of Monte Carlo simulations to create Ginga
GBD response matrices. Jaggedness is more apparent in fits to the data of S2 because they are
more informative (i.e., the number of counts per bin is higher for S2 than S1).

Fig. 7. — Best-fit continuum-plus-line(s) photon number spectra (top), observed count-rate spectra
and best-fit continuum-plus-line(s) count-rate spectra (middle), and residuals of the best-fit in units
of $\sigma$ (bottom) for the intervals S1 (left) and S2 (right) of GRB870303.

Fig. 8. — Same as Figure 6, for the joint fits to intervals S1 and S2 of GRB870303.

Fig. 9. — Best-fit continuum-plus-line(s) photon number spectra (top), observed count-rate spectra
and best-fit continuum-plus-line(s) count-rate spectra (middle), and residuals of the best-fit in units
of $\sigma$ (bottom) for the joint fit to intervals S1 (left) and S2 (right) of GRB870303.

Fig. 10. — Best-fit photon spectra for GRB870303 S1, assuming burst photon incidence angle $\theta_{\rm inc}$
= 37.7°. The solid line shows the best-fit PL continuum model, while the dashed line shows the
best-fit Band et al. continuum model.
Table 1. Continuum Models

Model                       Formula

Power Law (PL)              $dN/dE = A E^{-\alpha}$

PL x Exponential (PLE)      $dN/dE = A E^{-\alpha} \exp(-E/E_c)$

Band et al. (1993)          $dN/dE = A E^{-\alpha_1} \exp(-E/E_c)$, for $E < (\alpha_2 - \alpha_1)E_c$;
                            $dN/dE = A [(\alpha_2 - \alpha_1)E_c]^{\alpha_2 - \alpha_1} \exp(\alpha_1 - \alpha_2) E^{-\alpha_2}$, otherwise

Broken Power Law (BPL)      $dN/dE = A E^{-\alpha_1}$, for $E < E_b$;
                            $dN/dE = A E_b^{\alpha_2 - \alpha_1} E^{-\alpha_2}$, otherwise

Low-Energy Rollover (NH)

m).o,=m)^M-AxE~^) 



Note. — In the fitting code itself, it is the logarithms of the normalization $A$ and
PLE cutoff energy $E_c$ that are varied. Also, a "pivot" energy of 20 keV is used to
reduce the size of the frequentist confidence and Bayesian credible intervals for the
highly correlated parameters $A$ and $\alpha$. These changes alter the likelihood surface in
parameter space in such a way as to make it more closely resemble a multi-dimensional
Gaussian function.
Table 2. Energy-Loss Bins and Continua Used in Analyses

Spectrum   PC Bins   SC Bins   Selected Continuum

S1         8-15      2-31      PL
S2         7-15      2-31      PLE
S1+S2      10-15     2-31      PL(S1)+PLE(S2)^a

^a The continuum model is a six-parameter model in which
$\log A_{\rm S1}$ and $\alpha_{\rm S1}$ vary independently of $\log A_{\rm S2}$ and $\alpha_{\rm S2}$, but
for which $\cos(\theta_{\rm inc})_{\rm S1} = \cos(\theta_{\rm inc})_{\rm S2}$.
Table 3. Exponentiated Gaussian Line Models

Model    Free Line Parameters

S1-S     ($E$, $W_E$)
S1-U     ($E$, $W_E$, $W_{1/2}$)
S2-A     ($E_1$, $W_{E,1}$)
S2-B     ($E_1$, $W_{E,1}$, $W_{E,2}$)
S2-C     ($E_1$, $W_{E,1}$, $W_{1/2,1}$)
S2-D     ($E_1$, $W_{E,1}$, $W_{1/2,1}$, $W_{E,2}$, $W_{1/2,2}$)

Note. — Parameters not shown have values
set by the values of the parameters shown; e.g.,
for S2-A, $E_2 = 2E_1$, $W_{1/2,1} = \eta^{-1}W_{E,1}$, $W_{E,2} = 2W_{E,1}$,
and $W_{1/2,2} = \eta^{-1}W_{E,2}$. Not shown are
variations on the S2 class of models for which
$E_2 \ne 2E_1$. While these models were tested for
completeness, none significantly improved fits.
Table 4. Prior Probabilities for Exponentiated Gaussian Model

Model    Prior Probability


[2ri{E — -E'iow)(-E'high — -E'low)] "*" 

[2We{E — -E'low)(-E'high — -E'low)]^ 
[s-^lC-^high — 2E\ow)]~ 

[{27]Ei - We,i){Ei - ^iow)(^high - 2^iow)]-^ 
[^Wi^,Ei{Ehigh - 2Eia^)]-^ 

[^'^^1,1^^1,2(2^1 - ^i,i)(^i - ^iow)(^high - 2^10 



Note. — $\eta = (W_E/W_{1/2})_{\rm max} \approx 1.015$. $E_{\rm low}$ and $E_{\rm high}$ represent the
low and high energy-loss bandpass boundaries, respectively, for the Ginga
GBD. Not shown are the priors for variations on the S2 class of models
for which $E_2 \ne 2E_1$; while these models were tested, none significantly
improved fits. See Appendix B for details.



Models, in the order of the prior probabilities $p$ listed above: S1-S, S1-U, S2-A, S2-B, S2-C, S2-D.
Table 5. Line Significances and Odds

Dataset   Model    s²_C (dof)    s²_C+L (dof)   α_χ²MLR       L_C      L_C+L    Odds

S1        S1-S     42.71 (35)    22.22 (33)     3.6 x 10^-5   49.44    60.55    114:1
S2        S2-A     49.94 (35)    32.60 (33)     1.7 x 10^-4   89.90    98.73    7:1
S1+S2     S2-B^a   86.18 (66)    46.13 (62)     4.2 x 10^-8   126.44   147.28   40,300:1

^a All line parameters have the same value for S1 and S2 except $W_{E,1,{\rm S1}} \ne W_{E,1,{\rm S2}}$.
Table 6. Best-Fit Parameters

                        Frequentist                 Bayesian
Parameter               S1       S2       S1+S2     S1       S2       S1+S2

log A_S1^a              -0.77    -        -0.80     -0.84    -        -0.80
log A_S2^a              -        -0.56    -0.39     -        -0.55    -0.39
α_S1                    1.72     -        1.75      1.67     -        1.76
α_S2                    -        1.19     1.48      -        1.19     1.47
log E_c (keV)           -        2.14     2.39      -        2.13     2.38
E_1 (keV)               21.4     21.8     21.5      21.3     21.8     21.5
W_E,1,S1 (keV)          10.7     -        11.2      10.4     -        11.1
W_E,1,S2 (keV)          -        2.16     2.82      -        2.21     2.73
W_E,2 (keV)             -        -        2.72      -        -        2.86
cos(θ_inc)              0.54     0.82     0.60      0.58     0.82     0.60

^a Amplitudes at the "pivot" energy of 20 keV.
Table 7. S1: Parameter Estimation

Parameter       Method   Best Fit   1σ                2σ                3σ

log A^a         Freq     -0.77      [-1.05,-0.74]     [-1.23,-0.70]     [-1.29,-0.67]
                Bayes    -0.84      [-1.14,-0.84]     [-1.23,-0.73]     [-1.31,-0.66]
α               Freq     1.72       [1.52,1.78]       [1.35,1.85]       [1.26,1.92]
                Bayes    1.67       [1.47,1.70]       [1.37,1.80]       [1.27,1.90]
E (keV)         Freq     21.4       [20.3,22.5]       [19.2,23.9]       [18.1,25.7]
                Bayes    21.3       [20.0,22.5]       [18.7,24.0]       [17.5,25.8]
W_E (keV)       Freq     10.7       [8.41,13.1]       [6.16,15.9]       [3.80,19.8]
                Bayes    10.4       [7.85,12.9]       [5.74,15.8]       [4.55,17.8]
cos(θ_inc)      Freq     0.54       [0.54,0.78]       [0.54,0.98]       [0.54,0.98]
                Bayes    0.58       [0.54,0.72]       [0.54,0.95]       [0.54,0.978]

^a Amplitude at the "pivot" energy of 20 keV.
Table 8. S2: Parameter Estimation

Parameter       Method   Best Fit   1σ                2σ                3σ

log A^a         Freq     -0.56      [-0.60,-0.43]     [-0.69,-0.34]     [-0.72,-0.28]
                Bayes    -0.55      [-0.63,-0.47]     [-0.70,-0.38]     [-0.73,-0.34]
α               Freq     1.19       [1.10,1.32]       [0.97,1.44]       [0.89,1.54]
                Bayes    1.19       [1.15,1.31]       [1.00,1.42]       [0.90,1.54]
log E_c (keV)   Freq     2.14       [2.04,2.25]       [1.96,2.41]       [1.88,2.58]
                Bayes    2.13       [2.04,2.25]       [1.94,2.39]       [1.84,2.57]
E_1 (keV)       Freq     21.8       [20.3,22.5]       [19.2,23.9]       [18.1,25.7]
                Bayes    21.8       [21.1,22.6]       [20.2,23.3]       [19.5,24.0]
W_E,1 (keV)     Freq     2.16       [1.73,2.62]       [1.20,3.15]       [0.63,3.66]
                Bayes    2.21       [1.67,2.70]       [1.16,3.09]       [1.02,3.38]
cos(θ_inc)      Freq     0.82       [0.70,0.94]       [0.58,0.98]       [0.54,0.98]
                Bayes    0.82       [0.66,0.96]       [0.60,0.98]       [0.56,0.98]

^a Amplitude at the "pivot" energy of 20 keV.
Table 9. S1+S2: Parameter Estimation

Parameter          Method   Best Fit   1σ                2σ                3σ

log A_S1^a         Freq     -0.80      [-0.98,-0.73]     [-1.11,-0.67]     [-1.22,-0.61]
α_S1               Freq     1.75       [1.61,1.83]       [1.49,1.92]       [1.37,2.03]
log A_S2^a         Freq     -0.39      [-0.43,-0.33]     [-0.44,-0.24]     [-0.47,-0.22]
α_S2               Freq     1.48       [1.30,1.56]       [1.12,1.67]       [0.98,1.74]
log E_c (keV)      Freq     2.40       [2.22,2.59]       [2.08,2.91]       [1.97,∞]
E_1 (keV)          Freq     21.5       [20.8,22.1]       [20.1,22.6]       [19.3,23.2]
W_E,1,S1 (keV)     Freq     11.2       [9.15,13.4]       [7.08,16.0]       [5.02,19.2]
W_E,1,S2 (keV)     Freq     2.82       [2.04,3.40]       [1.35,4.00]       [0.64,4.71]
W_E,2 (keV)        Freq     2.72       [1.38,4.06]       [0.00,5.28]       [0.00,6.44]
cos(θ_inc)         Freq     0.60       [0.58,0.78]       [0.54,0.96]       [0.54,0.98]

^a Amplitude at the "pivot" energy of 20 keV.
Table 10. Analyses of GRB870303 Line Candidates

                                       Model       C Model   C+L Model   Number of   Sig or
Work        Spec    θ_inc    Stat      Comp Test   Comp?     Comp?       Line Par    Odds

M88         S2      37.7°    s²        F           N^a       N           6           ~10^-3
G92         S1      37.7°    s²        χ² MLR      Y         N           3           1x10^-?
            S2                                     Y         N           6           2x10^-?
G93         S1      37.7°    L         Odds        N^b       N           2^c         110:1
            S2                                     N^d       N           4^e         2.8:1
This Work   S1      Free     s²        χ² MLR      Y         Y           2           2.2x10^-5
                             L         Odds        N         Y           2           114:1
            S2               s²        χ² MLR      Y         Y           2           1.7x10^-4
                             L         Odds        N         Y           2           7:1
            S1+S2            s²        χ² MLR      Y         Y           4           4.2x10^-8
                             L         Odds        N         Y           4           40,300:1

^a M88 assume a thermal cyclotron continuum.
^b G93 apply the power-law continuum model used in G92.
^c G93 assume the line to be saturated.
^d G93 apply the two-segment broken power-law continuum model used in G92.
^e G93 assume the energies and full-widths of the two lines to be harmonically related.
Table 11. Effect of Changing Continuum Model and PC Bin Range on Fits

Dataset   Continuum   PC Bins   α_χ²MLR       Odds

S1        PL          8-15      1.2x10^-5     119:1
          PL          10-14     5.3x10^-6     339:1
          Band        8-15      1.6x10^-6     -^a
          Band        10-14     3.9x10^-6     -^a
S2        PLE         7-15      4.0x10^-5     14:1
          PLE         10-14     1.1x10^-5     15:1
          Band        7-15      1.1x10^-5     81:1
          Band        10-14     4.0x10^-6     87:1
          BPL         7-15      2.0x10^-5     28:1
          BPL         10-14     5.3x10^-5     10:1

Note. — We use the best-fit line model for each dataset,
and assume θ_inc = 37.7°.

^a $\sqrt{\det V}$ cannot be computed because the mode is too
close to a parameter space boundary.
Table 12. Effect of Increasing the Number of Line Parameters in Fits to S1 and S2

                  2 Par (S1-S)   3 Par (S1-U)   2 Par (S2-A)   5 Par (S2-D)   6 Par

s²_C              44.84          44.84          53.91          53.91          53.91
s²_C+L            22.23          22.23          33.63          31.13          30.49
α_χ²MLR           1.2x10^-5      4.8x10^-5      4.0x10^-5      3.7x10^-4      6.7x10^-4
E_1 (keV)         21.2           21.2           21.8           21.8           21.3
W_E,1 (keV)       9.98           9.98           2.24           3.35           3.12
W_1/2,1 (keV)     [10.1]         10.1           [2.27]         6.20           5.55
E_2 (keV)         -              -              [43.6]         [43.6]         44.4
W_E,2 (keV)       -              -              [5.48]         4.23           4.25
W_1/2,2 (keV)     -              -              [5.54]         4.67           4.80

Note. — Values in brackets are fixed by values of other parameters (see Table 3). We
assume θ_inc = 37.7°.